PREFACE


During the last fifty years there has been tremendous development in both the
theory and the methods of sample surveys to meet the growing need for reliable
data. In countries with planned economies, efficient collection of reliable data
against a background of varying environmental, socio-economic and cultural
conditions has become a prerequisite for better planning. To meet this
challenge, sampling experts around the world have come forward with a number of
excellent books on survey sampling incorporating scientific techniques, which
provide a foundation for both teaching and research besides being used as
handbooks by sample survey practitioners. The present book on sampling theory is
also an attempt in this direction and contains twelve chapters, including two
introductory chapters on preliminaries and basic concepts. In presenting the
material, care has been taken to arrange the chapters in such a way that the
reader first gets a view of the basic sampling designs before proceeding to the
use of auxiliary information in forming more efficient estimates in both simple
and complex designs. The last three chapters deal with the problems of
estimation for domains, variance estimation in complex designs and non-sampling
errors.

The readers have been exposed to recent developments in different 
aspects of sample surveys without presenting the proofs of complicated 
theorems, while simultaneously ensuring that the continuity of relevant 
discussions is not affected. 

The book is written in a lucid style and contains a fair amount of recent 
literature and references on the subjects to benefit both students and research 
scholars. 


A. K. P. C. Swain 

Bhubaneswar 
April 2003 
1. PRELIMINARIES


1.1 What is a Survey?

The term survey implies collecting information, either qualitative or
quantitative, on a finite set or subset of animate or inanimate units
constituting a population. For example, we may be interested in collecting
information on a set of persons, a set of animals, a set of plants, a set of
households, a group of villages or cities, or a group of business establishments
or educational institutions. The purpose of a survey is to provide interested
parties such as government departments, business concerns or research
institutions with the required information, to be used for future planning or
to know the present status of the population. An ill-conceived survey will lead
to waste of time and energy and to the accumulation of a huge mass of unwanted
information. This necessitates a good deal of meticulous planning and proper
thought on the part of a survey specialist before starting survey operations.
A survey may either cover the entire population in question, in which case it is
called a census survey, or be based on a part of the population, in which case
it is called a sample survey. Surveys are varied in nature and may be conducted
in a variety of fields. For instance, surveys may be conducted as a census of a
human population or a study of certain aspects of the human population
concerned, geological surveys to detect mineral deposits, engineering surveys to
determine the measures for flood control, soil surveys to detect soil types for
agricultural practices, quality control surveys for the acceptance of
manufactured products, traffic flow surveys, biological surveys relating to
agriculture, forestry, fishery, wildlife and ecology, epidemiological surveys,
surveys of manufacturing industries as to products and sales, consumer surveys,
public health surveys, public opinion surveys, educational surveys, demographic
surveys, housing surveys, etc.

1.2 Complete Enumeration vs Sample Survey 

Complete enumeration of all the units of the population is undertaken when one
needs to obtain exact information. However, information gathered through
complete enumeration is not without error. There may be observational errors
and also errors due to non-observation. Further, the effort, money and time
required are bound to be extremely large in the case of a very large population.
Moreover, one might not always need 100% accurate information for planning, and
sometimes information with an allowable or permissible margin of error serves
the desired objectives.

As an alternative to complete enumeration, a sample survey may be conducted with
the help of a sample of units to draw inferences about the characteristics of a
finite population. In practice, a sample survey deals with far fewer units than
complete enumeration and as such becomes a feasible proposition in the face of
generally limited financial resources, professional manpower, survey materials,
data processing equipment, etc. Further, in a sample study the errors which tend
to increase with the size of the sample play a less dominant role in vitiating
results than they do in a complete enumeration or census.

Besides having advantages as regards adaptability, speed and economy, 
the sampling procedure has greater flexibility and greater scope in terms of the 
subject coverage. 

The foregoing discussion does not imply that complete enumeration is
always unwarranted. Often, we come across situations requiring a complete
census, as with human and livestock populations, agriculture, industry,
assessment of income tax, recruitment of personnel for a particular
establishment, etc. While carrying out a complete census enquiry, the survey
scientist may like to collect information from a sample on some special needs
and problems. For instance, in a population census of households to
determine the total population on a reference date, we may wish to enquire
about income and expenditure particulars from a sample of households to
compute per capita income, or in an agricultural census we may like to take a
sample to study the indebtedness of farms, etc.


1.3 Major Steps in a Sample Survey 

The conduct of a sample survey usually consists of the following major steps:
(a) planning, (b) execution, (c) processing and analysis of data, (d) writing of
the report.

During the planning stage, the whole operation, from the determination of
objectives and fixation of the population under enquiry to the finalisation of
the report, should be kept in mind before execution of the survey. The questions
as to (i) the types of data to be collected, (ii) the appropriate precision to
be fixed for the sample results, (iii) the methods of data collection (physical
measurement, personal interview, mailed enquiry, transcription from secondary
sources such as official and non-official records either in printed or
manuscript form, diaries and notes, etc.), (iv) the determination of the
sampling frame and its construction, if a suitable one does not exist,
(v) sample size, (vi) determination of sampling units, (vii) selection of
sampling design and (viii) organization of field work are some of the major
considerations for conducting any survey. Preparation of schedules or
questionnaires needs to be given proper consideration in planning a survey,
because the reliability and utility of the results depend to a large extent on
this. The questions framed for eliciting information from the respondents
should be simple and unambiguous and should be so worded as not to influence
the answers of the respondents. It is desirable to avoid questions which may
arouse suspicion in the minds of respondents, leading to unreliable answers.
This will depend on the wording, order and content of questions in the
questionnaire. The items in the questionnaire should be grouped and coded
properly to facilitate the work of tabulation. Pretesting of the questionnaire
may be necessary to test its effectiveness. A pilot survey may be taken up
before the main survey to gain the required knowledge about the peculiarities
of the sampling units, problems relating to organization of field work, the
nature of response and nonresponse errors, and cost considerations in the field
work. This would help the survey scientists to devise solutions for tackling
the impending problems in the main survey. Proper selection and training of
investigators, efficient organization of field work and timely supervision
contribute towards obtaining reliable data.

Another aspect which may require special attention is the determination of
(i) the survey period, that is, the time period during which the survey data are
to be collected, (ii) the reference period, the time period to which the data
for all units should refer, and (iii) the reporting period, which may coincide
with or be a part of the reference period, for which the required statistical
information is collected for a unit at a time.

Processing of data and its subsequent analysis should be carried out, 
keeping in view the objectives with which the survey was started. 

The report should be written in a lucid style, exhibiting the analysis
depending on the objectives and survey data only. The content should also vary
according to whether the report is to be used by decision-making authorities or
for the purpose of research.

1.4 Probability Sampling vs Non-probability Sampling

Probability sampling is a method of selecting a sample from the population
that satisfies certain specific criteria. In this method it is possible to
define a set of distinct samples $s_1, s_2, \ldots, s_k$, say, such that each
possible sample $s$ has a definite and known probability of selection $p(s)$ and
each element or unit of the population has a non-zero probability of selection.
The selection of a particular sample is carried out in such a way that each
possible sample has its preassigned probability of selection. A sample realized
in such a manner is called a probability sample or sometimes a random sample.
In contrast to probability sampling, non-probability sampling does not
incorporate any chance mechanism into the method of selection of a sample. In
this method a sample may either be selected in a haphazard manner, minimising
the elements of human bias, or may be selected keeping in view the accessibility
and representativeness of different parts of the population. Sometimes personal
judgement and the definite purposes under consideration may help to select a
sample. This method of sampling is popularly known as judgement or
purposive sampling.


Non-probability sampling, although easy to execute, is beset with a
fundamental difficulty: it is not amenable to the mathematical theory of
sampling, which presupposes the drawing of units with definite and calculable
chance or probability. This leads to ignorance about the precision of results
obtained from the sample. However, non-probability sampling methods such as
quota sampling, judgement sampling or purposive sampling are preferred where no
sampling frame is available and the conditions for adopting probability
sampling do not exist, and moreover when quick results of the survey (as in the
case of opinion surveys) are required.


1.5 The Frame

A frame is a list of sampling units of the population from which a random
sample is to be drawn. The construction of a frame is an essential part of any
sampling enquiry based on probability sampling. If a readymade frame is not
available, the sampler has a prime duty to construct a frame for the purpose.
A frame need not necessarily be a list. It may also consist of maps showing the
boundaries of area units. Thus, frames may be broadly classified into two types:
a list frame, consisting of sampling units which may be either elementary units
or groups of elementary units called clusters, having proper identification
particulars, and an area frame or map frame, consisting of geographical area
units located unambiguously on a map. In a particular problem under
investigation more than one type of frame may be required under a complex
sampling design to reach the ultimate units of study. For instance, in an
agricultural survey for estimating yield per acre of rice in an administrative
district, one may need a frame for the subdivisions of the district, a frame for
the blocks within the selected subdivisions, a frame for the villages within the
selected blocks and an area frame for the fields within the selected villages in
a four-stage sampling set up.

Information about the target population may be obtained through the
corresponding frame population. For example, in an income survey of
individuals in a locality (target population) a suitable list may not be
available, but a list of households (frame population) may be available from
some source to help random sampling of households, which ultimately provides
information about individual incomes.

In certain cases the sampling frame may include, besides identification
particulars, some auxiliary information which may be exploited
for increasing the efficiency of selection of sampling units and of inference
based on the sample observations.

Frames are discussed further by Mahalanobis (1944), Yates (1949), and Hansen,
Hurwitz and Madow (1953).

1.6 Errors in Surveys 

Two types of errors creep into sample surveys: sampling error and
non-sampling error. The sampling error arises because a sample of units is
observed instead of a complete enumeration of all units in the population. The
non-sampling error has nothing to do with the sampling process and can creep
in at every stage of the survey, from the planning stage to the completion of
the survey report. It may be categorized as (a) errors at the planning stage,
such as wrong formulation of problems, ambiguous concepts, an ill-defined
population, use of defective frames for selection of the sample, defects in the
questionnaires or schedules, and wrong selection, inadequate training and
supervision of interviewers, etc., (b) errors of measurement or observational
errors due to faulty measuring devices, wrong answers, interviewer's bias, etc.,
(c) failure to measure or observe some of the units in the sample due to
non-coverage, defective frame, deaths or not-at-homes, inability or refusal to
answer questions, etc., in the case of human populations, and due to destruction
of crop for various reasons in the case of crop surveys, and (d) errors in data
processing such as editing, coding, tabulating, mathematical calculations and
printing of reports.

The sampling error or more clearly random sampling error is always 
discussed in relation to an estimator, a function of sample observations 
representing unknown population values such as total, mean, variance, 
coefficient of variation, etc. This is measured by calculating mean square error 
(m.s.e) which is the expected value of the squared deviation of the estimate 
from the true value and is expressed in terms of squared units. The reduction in 
sampling error is achieved by increasing the sample size, changing the 
sampling design, adopting the technique of stratification and also by the use of 
auxiliary information either at design stage or at the estimation stage or at both. 
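
As a worked form of this definition, the mean square error can be split into a variance part and a squared bias part. The symbols $t$ for the estimator, $\theta$ for the population value being estimated and $B(t)$ for the bias are notation assumed here for illustration only.

$$\mathrm{MSE}(t) = E(t - \theta)^2 = V(t) + [B(t)]^2, \qquad B(t) = E(t) - \theta.$$

For an unbiased estimator, $B(t) = 0$ and the m.s.e. coincides with the sampling variance.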

As regards non-sampling error, the situation is completely opposite in
the sense that the non-sampling error is likely to increase with the increase in
sample size and, further, it is more likely that a complete enumeration or
census will have greater non-sampling error than a sample survey.
Careful planning and execution of a sample survey reduces the non-sampling
error to a large extent. Proper selection and training of interviewers, timely
supervision, cross-checks provided in the questionnaires, use of repeated
measurements, and determination of the effect of measurement error through a
suitable model improve the reliability of the estimate to a large extent.

Non-response is another source of non-sampling error, defined as a failure
to obtain a desired and usable report from a reporting unit. This phenomenon
exists in almost all surveys, especially those on human beings such as
household surveys. The presence of non-response worries the investigator as to
the usefulness of survey estimates, and the unreliability of the results
increases with the rate of non-response. This calls for introducing effective
measures to reduce non-response to an insignificant level during planning
and at the time of collection of data.

1.7 Variance and Cost Functions 

To design and conduct a sample survey more efficiently, the investigator
should have a clear or approximately clear idea about the variance and cost
functions associated with the survey. The sampling error, measured by the
sampling variance or mean square error, generally decreases with the increase
in sample size. The cost of a survey has different components associated with
different aspects of the survey and increases with sample size. Both variance
and cost also depend on the nature of the sampling unit. Thus, it is necessary
to look into both variance and cost aspects to arrive at an optimum sampling
unit and optimum sample size. Mahalanobis (1944), in his epic work on crop
surveys, has considered in detail different aspects of variance and cost
functions.

The forms of variance and cost functions may be determined either by
empirical studies or from the experience gained in similar past surveys.



2. BASIC CONCEPTS IN SURVEY SAMPLING


2.1 Population and Sample 

A finite population $U$ is a labelled collection of $N$ distinguishable and
identifiable elements

$$U = \{u_1, u_2, \ldots, u_N\},$$

whose $k$th element is denoted by $u_k$. As the elements are labelled from 1 to
$N$, with a one-to-one correspondence between elements and labels, we may write
$U = \{1, 2, \ldots, N\}$, where $N < \infty$ is called the size of the finite
population.

A sample is a part of a population and naturally a subset of $U$; it is an
ordered collection of distinct elements, which may be drawn one after another
from $U$. The number of units in sample $s$ is called the sample size $n(s)$.
Naturally $n(s) \le N$. The sample is generally represented as an ordered
sequence of labels $s^*$, the order being the order of draws. The notation $s$
is reserved for an unordered set of elements, which may be arranged in either
increasing or decreasing order of magnitude of labels.

More specifically, let $i_t$ be the label of the unit drawn at the $t$th draw.
Then an ordered sample $s^*$ of size $n(s^*)$ from $U$ is an ordered sequence of
labels from $U$, represented by

$$s^* = (i_1, i_2, \ldots, i_t, \ldots, i_{n(s^*)}), \quad \text{where } 1 \le t \le n(s^*) \text{ and } i_t \in U.$$

Let $S^* = \{s^*\}$ denote the totality of all sequences of the form $s^*$.
$S^*$ is called the sample space. If sampling is with replacement, the labels
may not all be distinct. Denoting by $\nu(s^*)$ the number of distinct labels,
called the effective sample size, we have $\nu(s^*) \le n(s^*)$. In sampling
without replacement the labels are bound to be distinct, that is,
$i_1 \ne i_2 \ne \cdots \ne i_{n(s^*)}$ with $1 \le i_t \le N$.

If the order and repeats of the labels drawn are ignored, we shall have a set
sample $s$ for each sequence sample $s^*$. Thus, for a with replacement sample
of $n$ draws, the set sample may be represented as

$$s = (j_1, j_2, \ldots, j_{\nu(s)}), \quad 1 \le j_1 < \cdots < j_{\nu(s)} \le N,$$

where $\nu(s)$ is the number of distinct units in $s$, and for without
replacement sampling of $n$ units

$$s = (j_1, j_2, \ldots, j_n), \quad 1 \le j_1 < \cdots < j_n \le N.$$

The totality of set samples is denoted by $S = \{s\}$. The set $s$ may be
obtained by reducing a sequence $s^*$, that is, by mapping $S^*$ onto $S$. Thus
for every $s^* \in S^*$ there is an $s \in S$.


Example 1. Let $U = \{1, 2, 3\}$ represent a finite population having 3 units.
Then $s^*_1 = (1, 2, 3)$, $s^*_2 = (2, 3)$, $s^*_3 = (1)$ are samples from $U$,
where $n(s^*_1) = 3$, $n(s^*_2) = 2$ and $n(s^*_3) = 1$.

There are six equivalent forms of $s^*_1$, namely (1, 2, 3), (1, 3, 2),
(2, 1, 3), (2, 3, 1), (3, 1, 2) and (3, 2, 1), and two equivalent forms of
$s^*_2$, namely (2, 3) and (3, 2).

There are fifteen sequence samples of all possible sizes that can be drawn
out of $U$ without repeats. These are: (1), (2), (3), (1, 2), (2, 1), (1, 3),
(3, 1), (2, 3), (3, 2), (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2)
and (3, 2, 1), out of which only seven are distinct set samples: (1), (2), (3),
(1, 2), (1, 3), (2, 3) and (1, 2, 3).
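
The counts in Example 1 can be checked by brute-force enumeration. The following is a minimal Python sketch; the variable names `sequence_samples` and `set_samples` are chosen here for illustration and are not part of the text's notation.

```python
from itertools import permutations

U = [1, 2, 3]

# Sequence samples s*: ordered draws without repeats, of every size 1..N.
sequence_samples = [p for n in range(1, len(U) + 1) for p in permutations(U, n)]

# Set samples s: ignore the order of draws by sorting the labels,
# then keep each distinct set once.
set_samples = sorted({tuple(sorted(p)) for p in sequence_samples},
                     key=lambda s: (len(s), s))

print(len(sequence_samples))  # number of sequence samples
print(set_samples)            # distinct set samples
```

Sorting each sequence is one simple way to realise the mapping of $S^*$ onto $S$ described above.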

Example 2. Let $U = \{1, 2, 3, 4\}$. For with replacement sampling with sample
size 2, there are 4 x 4 = 16 possible sequence samples, and for without
replacement sampling, disallowing the repetitions (1, 1), (2, 2), (3, 3) and
(4, 4), the possible sequence samples (taking into account the order of the
draws) number 4 x 3 = 12. These are: $s^*_1 = (1, 2)$, $s^*_2 = (2, 1)$,
$s^*_3 = (1, 3)$, $s^*_4 = (3, 1)$, $s^*_5 = (1, 4)$, $s^*_6 = (4, 1)$,
$s^*_7 = (2, 3)$, $s^*_8 = (3, 2)$, $s^*_9 = (2, 4)$, $s^*_{10} = (4, 2)$,
$s^*_{11} = (3, 4)$ and $s^*_{12} = (4, 3)$. It may be seen that not all of them
are distinct samples. In fact, the distinct set samples $s$ are 6 in number.
That is, the 12 sequence samples $\{s^*\}$ are mapped onto 6 set samples
$\{s\}$ as follows:

$$\begin{array}{lll}
s^*_1 = (1,2), & s^*_2 = (2,1) & \mapsto\; s_1 = (1,2); \\
s^*_3 = (1,3), & s^*_4 = (3,1) & \mapsto\; s_2 = (1,3); \\
s^*_5 = (1,4), & s^*_6 = (4,1) & \mapsto\; s_3 = (1,4); \\
s^*_7 = (2,3), & s^*_8 = (3,2) & \mapsto\; s_4 = (2,3); \\
s^*_9 = (2,4), & s^*_{10} = (4,2) & \mapsto\; s_5 = (2,4); \\
s^*_{11} = (3,4), & s^*_{12} = (4,3) & \mapsto\; s_6 = (3,4).
\end{array}$$


In general, for with replacement sampling of $n$ draws out of $N$ units, there
will be $N^n$ possible sequence samples with repeats and
$N^{(n)} = N(N-1)\cdots(N-n+1)$ possible sequence samples without repeats, which
are equivalent in sets of $n!$. Thus, in sampling without replacement there will
be $N^{(n)}/n! = \binom{N}{n}$ set samples $s$.
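
The three counts above can be compared against direct enumeration for small $N$ and $n$. This is an illustrative sketch only; the function name `count_samples` is an assumption made here.

```python
from itertools import combinations, permutations, product
from math import comb, factorial, perm


def count_samples(N, n):
    """Count samples of n draws from U = {1, ..., N} by enumeration."""
    U = range(1, N + 1)
    with_repeats = sum(1 for _ in product(U, repeat=n))     # sequences, WR
    without_repeats = sum(1 for _ in permutations(U, n))    # sequences, WOR
    set_samples = sum(1 for _ in combinations(U, n))        # unordered sets
    return with_repeats, without_repeats, set_samples


N, n = 4, 2
counts = count_samples(N, n)
formulas = (N ** n, perm(N, n), comb(N, n))  # N^n, N^(n), N^(n)/n!
print(counts, formulas)
```

For $N = 4$ and $n = 2$ this reproduces the figures of Example 2.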


2.2 Sampling Design 

A sample, either $s^*$ (or its reduced form $s$), is selected from a given
finite population through a probability or chance mechanism, specified by a
sampling design. Thus, it is a logical necessity to define a sampling design with
the help of the totality of all samples $S^* = \{s^*\}$ and a certain probability
measure $p^* = \{p(s^*)\}$ defined on $S^*$.

Definition 2.1: An ordered sampling design $D^*$ is a pair $(S^*, p^*)$ based on
$U$, where $S^* = \{s^*\}$ is the totality or collection of all possible samples
$s^*$ and $p^*$ is a probability measure $\{p(s^*)\}$ defined on $S^*$ such that

$$p(s^*) \ge 0 \text{ for all } s^* \in S^* \quad \text{and} \quad \sum_{s^* \in S^*} p(s^*) = 1.$$


Notes 

1. As the concept of a sampling design boils down to defining a function
$p^* = \{p(s^*)\}$ on $S^*$ satisfying $p(s^*) \ge 0$ and $\sum p(s^*) = 1$, some
authors refer to $p^*$, satisfying the aforesaid conditions, as the sampling
design.

2. The subset $S^*_{p^*} = \{s^* \in S^* \mid p(s^*) > 0\}$ of $S^*$ is called the
support of $p^*$ (the effective sample space with respect to $p^*$). The
cardinality of $S^*_{p^*}$ is called the support size of the design $p^*$.

3. Let $n = n(s^*)$ refer to the size of $s^*$. The effective sample size
$|s^*|$ is the cardinality of $s^*$, i.e. the number of distinct units in $s^*$.

Definition 2.2: An unordered sampling design $D(S, P)$ is a collection
$S = \{s\}$ with a probability measure $P = \{p(s)\}$ defined on $S$ satisfying
$p(s) \ge 0$ and $\sum_{s \in S} p(s) = 1$. For every ordered sample $s^*$ there
exists an unordered sample $s$ obtained by ignoring the order in which the
labels occur. It is evident that $p(s) = \sum p(s^*)$, where the summation is
over all those $s^*$ which result in $s$.

If the sampling design does not depend on main variate values associated 
with the labels in s or s*, such a design is called a non-informative sample 
design. However, a non-informative sample design might depend on the 
auxiliary values associated with the labels for which advance information is 
available. 

Informative designs are those sequential designs where the sampler 
decides what labels to draw next after observing drawn labels and associated 
values of the characteristic (y) of interest (Basu, 1969; Zacks, 1969). 

Definition 2.3: An ordered design is called a fixed size (FS) design if
$n(s^*)$ is constant for all $s^* \in S^*$ such that $p(s^*) > 0$. If
$n(s^*) = n$, the design is called an FS($n$) design.

Definition 2.4: An ordered design is said to be a fixed effective size (FES)
design if $\nu(s^*)$ is constant for all $s^*$. If the constant is $n$, then it
is called an FES($n$) design.
The effective sample size relates to number of distinct units in the sample. 

Similar definitions may be given for unordered design. 
2.3 The Sampling Mechanism

Given a sampling design $(S^*, p^*)$ it is always possible to draw a probability
sample $s^*$ by the following mechanism:

a) Identify all possible samples $s^*$, say $k$ in number. Arrange them serially
from 1 to $k$, say $s^*(1), s^*(2), \ldots, s^*(k)$, with associated
probabilities $p(s^*(j))$, $j = 1, \ldots, k$.

b) Calculate the cumulative probabilities

$$C_i = \sum_{j=1}^{i} p(s^*(j)), \quad i = 1, \ldots, k.$$

c) Select a random number $R$ such that $0 < R < 1$ and select the sample
$s^*(i)$ if $C_{i-1} < R \le C_i$, with $C_0 = 0$.

A similar procedure may be adopted to select a sample $s$, given a sampling
design $D(S, P)$. Alternatively, $s$ may be reduced from $s^*$.
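
Steps (a) to (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_sample` is hypothetical, and a pseudo-random number generator stands in for a random number table.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def select_sample(samples, probs, rng=random):
    """Return one sample s*(i), chosen with probability p(s*(i))."""
    cumulative = list(accumulate(probs))   # C_1, ..., C_k
    r = 1.0 - rng.random()                 # uniform on (0, 1]
    # First index i with C_i >= r, i.e. C_{i-1} < r <= C_i (0-based here).
    i = bisect_left(cumulative, r)
    # Guard against C_k falling slightly below 1 through rounding.
    return samples[min(i, len(samples) - 1)]


samples = [(1, 2), (1, 3), (2, 3)]
probs = [0.5, 0.5, 0.0]
print(select_sample(samples, probs, random.Random(0)))
```

A sample with probability zero is never selected, because its cumulative total equals that of its predecessor.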

Although the above method of drawing a probability sample sounds simple, it
becomes inconvenient, tedious and time consuming if $N$ is large. Alternatively,
one may draw units one after another to meet the requirements of the given
sampling design. Hanurav (1962) has shown that for any given sampling design
there exists at least one sampling scheme which realises this sampling design.
In a given situation, to draw a probability sample we need (a) a list of the $N$
units of the population serially numbered from 1 to $N$, (b) a sampling
mechanism to draw units one by one, and (c) any published random number table
such as those published by Tippett (1927), Fisher and Yates (1938), Kendall and
Smith (1939), Rand Corporation (1955), Rao, Mitra and Mathai (1966) and others.

The successive draws following a particular sampling scheme may be
made with or without replacement of the units selected in the previous
draws, giving rise to sampling with replacement or sampling without
replacement respectively. Let $p(i_t \mid i_1, i_2, \ldots, i_{t-1})$ be the
conditional probability of drawing $i_t$ at the $t$th draw given that the units
$i_1, i_2, \ldots, i_{t-1}$ have already been drawn at the first, second, ...,
$(t-1)$th draws respectively.

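Definition 2.5: A sampling mechanism or scheme specifies a probability
function $p(i_t \mid i_1, i_2, \ldots, i_{t-1})$ such that

i) $p(i_t \mid i_1, i_2, \ldots, i_{t-1}) \ge 0$ for all $t \ge 1$,

ii) $\sum_{i_t} p(i_t \mid i_1, i_2, \ldots, i_{t-1}) = 1$ for all sets of
integers $i_1, i_2, \ldots, i_{t-1}$ with $1 \le i_1, i_2, \ldots, i_{t-1} \le N$.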

For the equal probability sampling (Simple Random Sampling) with replacement
scheme, $p(i_t \mid i_1, i_2, \ldots, i_{t-1}) = 1/N$, and for the equal
probability sampling without replacement scheme,
$p(i_t \mid i_1, i_2, \ldots, i_{t-1}) = 1/(N - t + 1)$. In the case of unequal
probability sampling with replacement, with $p_j$ as the probability of
selecting the $j$th unit, $p(i_t \mid i_1, i_2, \ldots, i_{t-1}) = p_{i_t}$, and
for unequal probability sampling without replacement,

$$p(i_t \mid i_1, i_2, \ldots, i_{t-1}) = \frac{p_{i_t}}{1 - p_{i_1} - p_{i_2} - \cdots - p_{i_{t-1}}}.$$
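
The conditional probabilities for without replacement sampling can be turned into a draw-by-draw routine. The sketch below is illustrative; the names `conditional_prob` and `draw_without_replacement` are assumptions, and `p` is taken to be a list of initial selection probabilities summing to one.

```python
import random
from fractions import Fraction


def conditional_prob(p, unit, drawn):
    """p(i_t | i_1, ..., i_{t-1}) for unequal probability sampling WOR."""
    if unit in drawn:
        return 0
    return p[unit - 1] / (1 - sum(p[j - 1] for j in drawn))


def draw_without_replacement(p, n, rng=random):
    """Draw n distinct labels from 1..N, one unit at a time."""
    remaining = list(range(1, len(p) + 1))
    sample = []
    for _ in range(n):
        weights = [p[j - 1] for j in remaining]
        # random.choices normalises the weights, which amounts to dividing
        # each p_j by (1 - sum of p over the units already drawn).
        unit = rng.choices(remaining, weights=weights, k=1)[0]
        sample.append(unit)
        remaining.remove(unit)
    return tuple(sample)


equal = [Fraction(1, 4)] * 4
print(conditional_prob(equal, 3, [1]))   # equal probabilities, second draw
print(draw_without_replacement([0.1, 0.2, 0.3, 0.4], 3, random.Random(1)))
```

With equal initial probabilities the conditional probability reduces to $1/(N - t + 1)$, as stated above.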

Illustration. Suppose the finite population $U$ has 3 units 1, 2, 3. For a
sample of size 2, the with replacement sequence samples are

$s^*_1 = (1, 1)$, $s^*_2 = (2, 2)$, $s^*_3 = (3, 3)$,

$s^*_4 = (1, 2)$, $s^*_5 = (2, 1)$, $s^*_6 = (1, 3)$,

$s^*_7 = (3, 1)$, $s^*_8 = (2, 3)$, $s^*_9 = (3, 2)$.

Thus for WR equal probability sampling

$$p(s^*_i) = 1/9, \quad i = 1, 2, 3, \ldots, 9,$$

and for WOR equal probability sampling

$$p(s^*_i) = \begin{cases} 0 & \text{for } i = 1, 2, 3, \\ 1/6 & \text{for } i = 4, 5, \ldots, 9. \end{cases}$$

Mapping $S^*$ onto $S$, for unordered WOR equal probability sampling we have
$\binom{3}{2} = 3$ set samples $s$, with

$p(s_1) = p(s^*_4) + p(s^*_5) = 1/3$,

$p(s_2) = p(s^*_6) + p(s^*_7) = 1/3$, and

$p(s_3) = p(s^*_8) + p(s^*_9) = 1/3$.
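
The same illustration can be reproduced numerically by summing $p(s^*)$ over the sequences that reduce to each set sample. This is a minimal sketch; the dictionary names `p_wr`, `p_wor` and `p_set` are chosen here for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

U = [1, 2, 3]
sequences = list(product(U, repeat=2))   # the 9 ordered pairs

# WR equal probability sampling: every sequence has probability 1/9.
p_wr = {s: Fraction(1, 9) for s in sequences}

# WOR equal probability sampling: repeats get 0, the other six get 1/6.
p_wor = {s: Fraction(0) if s[0] == s[1] else Fraction(1, 6) for s in sequences}

# Unordered design: p(s) is the sum of p(s*) over all s* that reduce to s.
p_set = defaultdict(Fraction)
for s_star, prob in p_wor.items():
    if prob > 0:
        p_set[tuple(sorted(s_star))] += prob

print(dict(p_set))
```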

Hanurav (1966) has proposed a unit drawing chance mechanism with the
help of an algorithm $A = \{q_1, q_2(s^*), q_3\}$, where

1. $(S^*_1, q_1)$ is a sampling design defined on $U$ or a subset of $U$ with
$S^*_1 \subset \{(i) \mid 1 \le i \le N\}$, that is, $q_1$ is a probability
measure defined on $U$ with $0 \le q_1((i)) \le 1$, $(i) \in S^*_1$, and
$\sum_{(i) \in S^*_1} q_1((i)) = 1$.

2. $q_2(s^*)$ is a non-negative function defined for all samples
$s^* \in S^*$, with $0 \le q_2(s^*) \le 1$ for all $s^* \in S^*$.

3. For $q_2(s^*) \ne 0$, $q_3[S^*_2(s^*), q_3((j) \mid s^*)]$ is a sampling design
on $U - s^*$ or any subset of it with
$S^*_2(s^*) \subset \{(j) \mid j \in U - s^*\}$, that is,
$0 \le q_3((j) \mid s^*) \le 1$, $(j) \in S^*_2(s^*)$, and
$\sum_{(j) \in S^*_2(s^*)} q_3((j) \mid s^*) = 1$.

The sampling scheme is implemented through the following steps:

(a) Draw the first unit by using the sampling design $(S^*_1, q_1)$. Let $i_1$ be
the unit drawn and denote the sample so obtained as $s^*_1 = (i_1)$.

(b) Next, evaluate $q_2$ at $s^*_1$. If $q_2(s^*_1) = 0$, sampling is terminated.
If $q_2(s^*_1) = 1$, proceed to the next step. If $0 < q_2(s^*_1) < 1$, conduct a
binomial trial with probability of success $q_2(s^*_1)$. If it results in a
failure, the sampling is terminated with $s^*_1$ as the ultimate sample. If the
trial results in a success, proceed to the next step.

(c) Draw a second unit from the remaining $U - s^*_1$ units, using the sampling
design $q_3[S^*_2(s^*_1), q_3((j) \mid s^*_1)]$.

Let $i_2$ ($\ne i_1$) be the second unit thus selected. The combined sample
having two units is $s^*_2 = (i_1, i_2)$. Now go back to step (b) with $s^*_2$
replacing $s^*_1$. Repeat this procedure until sampling is stopped and a final
sample is obtained.
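
Steps (a) to (c) can be expressed as a short loop. The following Python sketch is a hypothetical rendering, not Hanurav's own formulation: `q1` is assumed to be a dictionary of first-draw probabilities, `q2` a function giving the continuation probability for the current sequence, and `q3` a function returning the conditional draw probabilities over units not yet drawn. The helper `srswor_algorithm` is an assumed example showing one triple that yields simple random sampling without replacement.

```python
import random


def unit_drawing(q1, q2, q3, rng=random):
    """Sketch of a unit drawing mechanism A = {q1, q2, q3}.

    q2 must return 0 once no units remain, otherwise q3 has nothing to draw.
    """
    units, probs = zip(*q1.items())
    s_star = (rng.choices(units, weights=probs, k=1)[0],)       # step (a)
    while True:
        cont = q2(s_star)                                       # step (b)
        # Binomial trial: continue with probability q2(s*), stop otherwise.
        if cont <= 0 or rng.random() >= cont:
            return s_star
        units, probs = zip(*q3(s_star).items())                 # step (c)
        s_star += (rng.choices(units, weights=probs, k=1)[0],)


def srswor_algorithm(N, n):
    """Assumed example: a triple (q1, q2, q3) for SRSWOR of size n."""
    q1 = {i: 1.0 / N for i in range(1, N + 1)}

    def q2(s_star):
        return 1.0 if len(s_star) < n else 0.0

    def q3(s_star):
        rest = [j for j in range(1, N + 1) if j not in s_star]
        return {j: 1.0 / len(rest) for j in rest}

    return q1, q2, q3


q1, q2, q3 = srswor_algorithm(5, 3)
print(unit_drawing(q1, q2, q3, random.Random(0)))
```

Letting `q2` return values strictly between 0 and 1 would give a design with a random sample size.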

2.4 Inclusion Probabilities of the Units 

The inclusion of an element or a unit $i$ in a sample $s$ is a random event
indicated by a random variable $I_i = I_i(s)$ defined as

$$I_i = \begin{cases} 1 & \text{if } i \in s, \\ 0 & \text{otherwise.} \end{cases}$$

$I_i$ is called the sample membership indicator of element $i$.

Define $\pi_i$ as the first order inclusion probability of the $i$th unit in $U$,
that is, the probability that it is included in the sample $s$ under the
sampling design $D(S, P)$. Then,

$$\pi_i = P(i \in s) = P(I_i = 1) = \sum_{s \ni i} p(s),$$

where the summation is over all $s \in S$ containing the $i$th unit.

[Note: $E(I_i) = \pi_i$; $V(I_i) = \pi_i(1 - \pi_i)$; $0 \le \pi_i \le 1$.]

Similarly we may define $\pi_{ij}$ as the second order inclusion probability of
including the $i$th and $j$th units in the sample $s$ under the sampling design
$D$. Then,

$$\pi_{ij} = P(i \text{ and } j \in s) = P(I_i = 1, I_j = 1) = \sum_{s \ni i, j} p(s).$$

Note: $E(I_i I_j) = \pi_{ij}$; $\pi_{ii} = \pi_i$, and
$\pi_i + \pi_j - 1 \le \pi_{ij} \le \min(\pi_i, \pi_j)$, $(i \ne j)$.

In general the $k$th order inclusion probability may be written as

$$\pi_{i_1 i_2 \ldots i_k} = \sum_{s \ni i_1, i_2, \ldots, i_k} p(s).$$

Consider a simple random sampling without replacement scheme. Any
ordered sequence of $n$ selected elements has the probability

$$p(s^*) = 1/N^{(n)} = (N-n)!/N!.$$

A specified set $s$ having the aforesaid $n$ elements can be obtained in $n!$
different sequences. Hence,

$$p(s) = \frac{n!\,(N-n)!}{N!} = \frac{1}{\binom{N}{n}}.$$

In this design

$$\pi_i = P(I_i = 1) = \sum_{s \ni i} p(s) = \binom{N-1}{n-1} \Big/ \binom{N}{n} = \frac{n}{N},$$

because $\binom{N-1}{n-1}$ samples $s$ include element $i$, and

$$\pi_{ij} = P(I_i = 1, I_j = 1) = \sum_{s \ni i, j} p(s) = \binom{N-2}{n-2} \Big/ \binom{N}{n} = \frac{n(n-1)}{N(N-1)},$$

because $\binom{N-2}{n-2}$ samples $s$ include $i$ and $j$.
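
These closed forms can be confirmed by listing all set samples for a small population. The sketch below is illustrative; `srswor_inclusion` is a name assumed here, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def srswor_inclusion(N, n):
    """First and second order inclusion probabilities under SRSWOR."""
    labels = range(1, N + 1)
    samples = list(combinations(labels, n))
    p = Fraction(1, comb(N, n))            # p(s) is the same for every s
    pi = {i: sum(p for s in samples if i in s) for i in labels}
    pij = {(i, j): sum(p for s in samples if i in s and j in s)
           for i, j in combinations(labels, 2)}
    return pi, pij


pi, pij = srswor_inclusion(5, 2)
print(pi[1], pij[(1, 2)])   # compare with n/N and n(n-1)/(N(N-1))
```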


2.5 Relation between Sample Size and Inclusion Probability 

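As mentioned earlier, for a sampling design $D$ with fixed sample size $n$,
$n(s) = n$ for all $s$. If the sampling design has variable sample size, $n(s)$
is a random variable having

$$E[n(s)] = \sum_{s \in S} n(s)\,p(s) = \nu, \text{ say.}$$

Now,

$$\sum_{i} \pi_i = \sum_{i} \sum_{s \ni i} p(s) = \sum_{s \in S} \sum_{i \in s} p(s)
= \sum_{s \in S} p(s) \sum_{i \in s} 1 = \sum_{s \in S} p(s)\,n(s) = E[n(s)] = \nu,$$

$$\sum\sum_{i \ne j} \pi_{ij} = \sum\sum_{i \ne j} \sum_{s \ni i, j} p(s)
= \sum_{s \in S} \sum\sum_{i \ne j \in s} p(s)
= \sum_{s \in S} p(s) \sum\sum_{i \ne j \in s} 1
= \sum_{s \in S} n(s)\,(n(s) - 1)\,p(s) = \nu(\nu - 1) + V(n(s)).$$

For a fixed sample size design:

(i) $\sum_{i} \pi_i = n \sum_{s \in S} p(s) = n$, as $n(s) = n$ for all $s \in S$,

(ii) $\sum\sum_{i \ne j} \pi_{ij} = n(n-1) \sum_{s \in S} p(s) = n(n-1)$,

(iii) $\sum_{j \ne i} \pi_{ij} = (n-1) \sum_{s \ni i} p(s) = (n-1)\,\pi_i$.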

2.6 Parameter, Data and Estimator 

Let $y$ be a real valued variate with values $Y_1, Y_2, \ldots, Y_N$ for the units labelled
$1, 2, \ldots, N$ respectively.

The vector $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_N)'$ is referred to as the parameter of the
finite population. The totality of vector points $\{\mathbf{Y}\} = \{(Y_1, \ldots, Y_N)'\}$, $a_i \le Y_i \le b_i$,
where $a_i, b_i$ are known or unknown real numbers, $i = 1, 2, \ldots, N$, generate a parametric
space $\Omega$ which is a proper subset of $N$-dimensional Euclidean space $R^N$.

In a sample survey one is required to draw inferences on some well
defined parametric function of $(Y_1, Y_2, \ldots, Y_N)$, say $\theta(\mathbf{Y})$, which is assumed to
be real valued, that is, a point on $R^1$, after observing $y$ values associated with
the units in the sample $s$.

The parametric functions of interest in sample surveys are usually the
population total $Y = \sum Y_i$, population mean $\bar{Y} = (1/N)\sum Y_i$, population variance
$\sigma_y^2 = (1/N)\sum (Y_i - \bar{Y})^2$, coefficient of variation $C_y = \sigma_y/\bar{Y}$, etc.

Once a sample $s^*$ or $s$ has been selected, the next problem is to observe the
variate values $y$ associated with the sampled units. These values are called data.
Thus, we represent labelled data as $d^*$ in the form of an ordered sequence

$$d^* = ((i_t, Y_{i_t}),\ i_t \in s^*) = (s^*, y), \quad 1 \le t \le n(s^*).$$

If the order (and multiplicity) of selection is ignored, the data are represented by
the unordered set of pairs

$$d = ((i, Y_i);\ i \in s) = (s, y).$$

Any $d^*$ can be summarized to $d$ by obtaining $s$ and $y$ corresponding to $s^*$. The
totality of all possible data points constitutes a data space

$$\mathcal{D}^* = \{d^*\} \quad \text{or} \quad \mathcal{D} = \{d\}.$$

Definition 2.6: Given an unordered data point $d$, we define a statistic $t(d)$ as a
real valued function of $d$, given by $t(d) = t(s, y)$, defined on $(S \times R^N)$; that
is, $t(s, y)$ is a mapping from $(S \times R^N)$ to $R^1$ and for each $s \in S$ depends on the $Y_i$'s for
which $i \in s$.

The statistic $t(d)$ is called an estimator when it represents a relevant
parametric function, and for a realized sample $s$ and the numerical data points
its value is called an estimate. Thus, the estimator $t(d)$ or simply $t$ is based on a
sample $s$ selected according to a sampling design $p$ and hence for the purpose of
making inferences our interest centres around both $p$ and $t$, i.e. a combination
$(p, t)$, called a sampling strategy or simply a strategy.

2.7 Basic Properties of Estimators 

Let $y$ be the study variable and $\theta$ be the parametric function of $Y_1, Y_2, \ldots, Y_N$,
given by $\theta = \theta(Y_1, Y_2, \ldots, Y_N)$. Let $t$ be an estimator of $\theta$ based on a given sample
$s$ selected through a sampling design $p$. It is then possible to calculate possible
values of $t$ based on different possible samples $s$ in $S$, giving rise to a sampling
distribution of $t$. Two important characteristics of sampling distributions which
are given prime importance in survey sampling literature for comparison of
estimators are the expected value of the estimator over possible samples $s$ and
the variance of the estimator or in other words variance of the sampling
distribution of the estimator. Thus, we define the expected value and variance
of $t$ as follows.

Definition 2.7: An estimator $t$ is said to be $p$-unbiased or design unbiased if

$$E_p(t) = \sum_{s \in S} p(s)\, t(s, y) = \theta(Y_1, Y_2, \ldots, Y_N) \quad \text{for all } \mathbf{Y} \in R^N.$$

When $E_p(t) \ne \theta$, define $B_p(t) = E_p(t) - \theta$, called the $p$-bias or design bias of $t$. For
an unbiased estimator $B_p(t) = 0$.

Definition 2.8: The $p$-variance or design variance of an estimator $t$ is defined as

$$V_p(t) = E_p[t - E_p(t)]^2 = \sum_{s \in S} p(s)\,[t(s, y) - E_p(t)]^2.$$

When $t$ is a biased estimator, the $p$-mean square error or $p$-MSE is defined as

$$\mathrm{MSE}_p(t) = E_p(t - \theta)^2 = \sum_{s \in S} [t(s, y) - \theta]^2\, p(s).$$

Definition 2.9: A strategy $(p, t)$ is said to be $p$-unbiased if $t$ is $p$-unbiased under
the design $p$. The $p$-variance and $p$-MSE of a strategy $(p, t)$ are respectively the
$p$-variance and $p$-MSE of the estimator $t$, when used with sampling design $p$.
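The design expectation, bias, variance and mean square error are all finite sums over the sample space, so they can be computed exactly for a small population. The sketch below assumes four made-up $y$-values and simple random sampling without replacement of size 2; the sample maximum is used only as a deliberately biased estimator of the mean, and all names are hypothetical. It also illustrates the standard identity $\mathrm{MSE}_p(t) = V_p(t) + B_p(t)^2$.

```python
from fractions import Fraction
from itertools import combinations

Y = {1: 3, 2: 5, 3: 10, 4: 14}  # hypothetical y-values for units 1..4
N, n = len(Y), 2
samples = [frozenset(s) for s in combinations(Y, n)]
p = {s: Fraction(1, len(samples)) for s in samples}  # SRSWOR: equal p(s)

theta = Fraction(sum(Y.values()), N)  # parametric function: population mean

def sample_mean(s):
    return Fraction(sum(Y[i] for i in s), len(s))

def sample_max(s):
    # deliberately biased estimator of the mean, for illustration only
    return max(Y[i] for i in s)

def expectation(t):
    return sum(p[s] * t(s) for s in samples)

def bias(t):
    return expectation(t) - theta

def variance(t):
    e = expectation(t)
    return sum(p[s] * (t(s) - e) ** 2 for s in samples)

def mse(t):
    return sum(p[s] * (t(s) - theta) ** 2 for s in samples)
```

For the sample mean the bias is zero and the MSE coincides with the variance; for the sample maximum the MSE exceeds the variance by the squared bias.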

Very often an estimator is such that although it is biased, its bias is 
negligible for large sample size. For such a situation we introduce the concept 
of consistency. An estimator $t$ is said to be consistent for estimating the
parameter $\theta$ if, for every $\varepsilon > 0$, $P(|t - \theta| > \varepsilon)$ tends to zero as the sample size $n$ tends to infinity.
However, this definition of consistency is based on a concept that samples are
drawn from infinite populations. A definition in consonance with the concept of
finite population may be given as follows:

Definition 2.10: An estimator $t$ is said to be consistent for $\theta(Y_1, Y_2, \ldots, Y_N)$ if
$t = \theta$ when $n(s) = N$.

Definition 2.11: Given two estimators $t_1$ and $t_2$ of a parameter $\theta$, $t_1$ is said to be
better than $t_2$ if $\mathrm{MSE}(t_1) \le \mathrm{MSE}(t_2)$ for all $\mathbf{Y} \in R^N$, with strict inequality for at
least one set $(Y_1, Y_2, \ldots, Y_N)$.

Simple comparison of strategies can be made with the help of estimators 
involved in the concerned strategies. 

Definition 2.12: A strategy $(p_1, t_1)$ is said to be better than the strategy
$(p_2, t_2)$ if $\mathrm{MSE}(p_1, t_1) \le \mathrm{MSE}(p_2, t_2)$ for all $\mathbf{Y} \in R^N$, with strict inequality holding
for at least one $\mathbf{Y} \in R^N$.


2.8 Concepts of Sufficiency and Rao-Blackwellization 

As in case of classical statistical inference, the sample survey theory calls for 
search of estimators among the class of estimators which are functions of 
sufficient statistic, because of some favourable properties to be known later on. 
The concepts of sufficiency and Rao-Blackwellization were first introduced by 
Basu (1958) and subsequently developed by Basu and Ghosh (1967), Pathak 
(1964) and Basu (1969). 

Definition 2.13: A statistic $T = f(d)$ is said to be sufficient for the parameter
$(Y_1, Y_2, \ldots, Y_N)$ if and only if the conditional distribution of $d$ given $T = T_0$ is
independent of $\mathbf{Y}$.

By invoking Rao-Blackwellization in survey sampling it may be stated
that for any estimator $t^* = t^*(d^*)$ of $\theta(Y_1, Y_2, \ldots, Y_N)$ which depends on $d^*$, we
may find a better estimator $t = t(d)$ depending on $d$.

Define $t = E(t^* \mid d)$.

Now,

$$E(t^* \mid d) = \frac{\sum' t^*(d^*)\, p(s^*)}{\sum' p(s^*)},$$

where summation $\sum'$ is over all samples $s^*$ equivalent to $s$. Hence,

$$E(t) = \sum_{s} p(s)\, \frac{\sum' t^*(d^*)\, p(s^*)}{\sum' p(s^*)} = \sum_{s^*} t^*(d^*)\, p(s^*) = E(t^*),$$

where $p(s) = \sum' p(s^*)$.

Further, $E(t^* t) = E[E(t^* t \mid d)] = E(t^2)$.

Now, $E(t^* - t)^2 = E(t^{*2}) - E(t^2)$.

As LHS $\ge 0$, we have $E(t^{*2}) \ge E(t^2)$, showing thereby $V(t) \le V(t^*)$.
The estimator $t$ is constructed by Rao-Blackwellization from $t^*$ after conditioning on
$d$. Since $d$ is free from order and multiplicity, $t$ is independent of the order and
multiplicity in the sample. Thus, for any estimator $t^*$ depending on order and
multiplicity we can find another estimator $t$ independent of order and
multiplicity which is better than $t^*$. But in fact there is
an infinity of such estimators as the sufficient statistic $d$ is not complete. Lack of
completeness makes our search for unique best estimator inconclusive.
However, if at all one such exists it must belong to a class of estimators ignoring
order and multiplicity.
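A small enumeration makes the variance reduction concrete. The sketch below assumes simple random sampling with replacement of $n = 3$ draws from a made-up population of $N = 3$ units, takes $t^*$ to be the mean of all draws (which depends on multiplicity), and forms $t = E(t^* \mid d)$ by averaging $t^*$ over all ordered samples with the same set of distinct units. The population values and function names are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

Y = {1: 2, 2: 4, 3: 9}                 # hypothetical y-values
N, n = len(Y), 3
ordered = list(product(Y, repeat=n))   # all ordered samples s*
p_star = {seq: Fraction(1, N ** n) for seq in ordered}  # equal p(s*) under SRSWR

def t_star(seq):
    """Mean over all n draws; repeated units are counted each time."""
    return Fraction(sum(Y[i] for i in seq), n)

# Group ordered samples s* by their set of distinct labels (the unordered s)
equivalent = defaultdict(list)
for seq in ordered:
    equivalent[frozenset(seq)].append(seq)

def t_rb(seq):
    """E(t* | d): p(s*)-weighted average of t* over samples equivalent to seq."""
    group = equivalent[frozenset(seq)]
    num = sum(p_star[e] * t_star(e) for e in group)
    den = sum(p_star[e] for e in group)
    return num / den

def expect(t):
    return sum(p_star[seq] * t(seq) for seq in ordered)

def var(t):
    m = expect(t)
    return sum(p_star[seq] * (t(seq) - m) ** 2 for seq in ordered)
```

In this example `t_rb` works out to the mean of the distinct units in the sample; both estimators have the same expectation, while `var(t_rb)` is strictly smaller than `var(t_star)`.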

2.9 Likelihood Function 

In traditional statistical inference the likelihood function has played an 
important role in the estimation of parameters. However, in sample survey 
theory the likelihood lacks unique maximum as shown by Godambe (1966). 

Definition 2.14: Given data $d = \{(i, Y_i),\ i \in s\}$ the likelihood function $L_d(\mathbf{Y})$ is a
function of parameter $\mathbf{Y}$, which for any $\mathbf{Y} \in R^N$ gives the probability of
obtaining $d$ if $\mathbf{Y}$ were the true parameter value.

Thus, for any design $p$, the likelihood function $L_d(\mathbf{Y})$ is given by

$$L_d(\mathbf{Y}) = \begin{cases} p(s) & \text{for all } \mathbf{Y} \in \Omega_d, \\ 0 & \text{otherwise,} \end{cases}$$

where $\Omega_d$ is the set of parameter points $(Y_1, Y_2, \ldots, Y_N)$ that are consistent with
the observed data $d$. It may be seen that the likelihood function does not have a unique
maximum and is uninformative because all possible unobserved components of
$\mathbf{Y}$ in $\Omega_d$ have the same likelihood. Two controversial issues arise, namely that
inference from survey data should be independent of the sampling design and that
information from data is carried by the likelihood function. These views are
not acceptable to statisticians because choice of an estimator depends on
calculation of MSE under a given sampling design and further the likelihood
function through fixed population approach to survey sampling does not admit
discrimination among the possible values of the unobserved components of $\mathbf{Y}$
since all values $\mathbf{Y} \in \Omega_d$ have the same likelihood.
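The flatness of the likelihood can be seen in a few lines. The sketch below assumes made-up data on two observed units and a made-up value of $p(s)$; the function name `likelihood` and the three candidate parameter vectors are hypothetical.

```python
from fractions import Fraction

def likelihood(Y, d, p_s):
    """L_d(Y): p(s) if Y agrees with every observed (label, value) pair in d, else 0."""
    return p_s if all(Y[i] == y for i, y in d.items()) else Fraction(0)

d = {1: 7, 3: 2}       # hypothetical data: units 1 and 3 observed
p_s = Fraction(1, 6)   # hypothetical p(s) for s = {1, 3}

# Two parameter points consistent with d that differ on the unobserved units,
# and one point that contradicts the observed value for unit 1.
Y_a = {1: 7, 2: 0, 3: 2, 4: 100}
Y_b = {1: 7, 2: -5, 3: 2, 4: 1}
Y_c = {1: 8, 2: 0, 3: 2, 4: 100}
```

`Y_a` and `Y_b` receive the same likelihood $p(s)$ although they disagree completely on the unobserved units, while `Y_c` receives likelihood zero.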

Commenting on the uninformative nature of the likelihood in sample 
surveys, C.R. Rao (1977) states that the uninformative nature of the likelihood 
can not be attributed to some measure theoretic difficulties arising out of 
degenerate distributions etc, but is inherent in the problem itself of first 
choosing some units and making observations on them. Realizing that the 
likelihood alone is unable to provide our answer to estimate a parametric 
function in sampling from a finite population, Basu (1969) advocates the use of 
prior information on the unknown and application of Bayesian techniques. 

Royall (1968), Hartley and Rao (1968, 1969) and C.R. Rao (1971) discussed
the likelihood function in a different way, with the label part of the data
being deliberately ignored once the sample has been drawn.

2.10 Uniformly Minimum Variance Unbiased Estimator

Given a class of unbiased estimators $A_u$, the question arises as to whether there
exists an estimator having minimum variance for all $\mathbf{Y} \in R^N$. Such an estimator
is called a uniformly minimum variance unbiased estimator.

Definition 2.15: An estimator $t_0$ is said to be a uniformly minimum variance
unbiased estimator in a class $A$ for a parametric function under a given design
$p$ if

$$V(t_0) \le V(t)$$

for all $t \in A$ and for all $\mathbf{Y} \in R^N$.

Godambe (1955) was the first to point out the non-existence of a UMV
estimator in a class of linear homogeneous unbiased estimators of the 
population mean. Godambe and Joshi (1965) proved the non-existence of 
UMV estimator in the class of all unbiased estimators. However, UMV 
estimators do exist in a very much restricted class of estimators under certain 
designs. More about this will be discussed in a later chapter. 

Absence of uniformly minimum variance property with the estimators has 
encouraged statisticians to look for other optimal properties such as 
admissibility, minimaxity, etc. 

2.11 Admissibility 

The concept of admissibility of an estimator, first considered by Godambe
(1960) and Roy and Chakravorty (1960), is not as powerful a concept as
uniformly minimum variance (UMV). Its necessity and importance arise out of
the non-existence of uniformly minimum variance unbiased estimator within 
some general classes of estimators. Therefore, search is restricted to a class of 
estimators in which no estimator is uniformly better than the admissible 
estimator. However, the criterion of admissibility does not single out one 
unique estimator and many traditional estimators in survey sampling have been 
shown to be admissible. 

Definition 2.16: An estimator $t = t(d)$ belonging to some class $\mathcal{C}$ of estimators is
said to be admissible in $\mathcal{C}$ under a design $p$, if and only if no estimator in $\mathcal{C}$ is
better than $t(d)$.

An estimator $t^*$ depending on order and multiplicity of sampled units is
inadmissible because by the application of Rao-Blackwellization $t = E(t^* \mid d)$
is better than $t^*$.

The concept of admissibility can also be extended to the case of choice of 
strategies (Joshi, 1965, 1966). 

Definition 2.17: A strategy $(p, t)$ belonging to class $\mathcal{H}$ is said to be admissible in
$\mathcal{H}$ if and only if there exists no strategy in $\mathcal{H}$ better than $(p, t)$.

2.12 Hyperadmissibility

Hanurav (1966, 1968) put forward the concept of hyperadmissibility by
introducing a special type of subset of the parameter space $\Omega$, called 'principal
hypersurface'.

The principal hypersurface $\Omega(i_1, i_2, \ldots, i_k)$ is defined as the set of all $\mathbf{Y} \in R^N$
such that $Y_i = 0$ for $i \ne i_1, i_2, \ldots, i_k$ and $-\infty < Y_i < +\infty$ for $i = i_1, i_2, \ldots, i_k$. It is a
subset of $R^N$ where certain prescribed co-ordinates are held equal to zero. The
zero vector $\mathbf{0} = (0, 0, \ldots, 0)$ is contained in each principal hypersurface.

Definition 2.18: An estimator $t$ belonging to some class $\mathcal{F}$ of estimators is said
to be hyperadmissible in $\mathcal{F}$ under a given design $p$ if and only if for any
$\Omega(i_1, i_2, \ldots, i_k)$ in $R^N$, $t$ is admissible in $\mathcal{F}$ under the restricted parameter space
$\Omega(i_1, i_2, \ldots, i_k)$.

Thus, the hyperadmissibility implies admissibility in all possible 
sub-populations. 

2.13 Minimaxity 

As seen earlier, the performance of a strategy $(p, t)$ is determined by its
$\mathrm{MSE}(p, t)$, which is defined on the parameter space $\Omega$, the set of all $\mathbf{Y}$. Now define

$$R(p, t) = \sup_{\mathbf{Y} \in \Omega} \mathrm{MSE}(p, t).$$

Let us suppose $R(p, t)$ is finite for some strategies $(p, t)$ of a class $\mathcal{F}$ fixed
in advance. The principle of minimaxity suggests looking for a strategy which
minimizes $R(p, t)$. Then

$$r(p, t) = \inf_{(p, t) \in \mathcal{F}} R(p, t) = \inf_{(p, t) \in \mathcal{F}}\ \sup_{\mathbf{Y} \in \Omega} \mathrm{MSE}(p, t) < \infty$$

is the minimum of the maximum value (minimax value) with respect to $\Omega$ and
$\mathcal{F}$. The strategy $(p', t') \in \mathcal{F}$ is called the minimax strategy if $R(p', t') = r(p, t)$.
Agarwal (1959), Godambe and Joshi (1965), Royall (1970), Joshi (1965, 1966, 
1968) and Scott and Smith (1975) are considered as pioneers in studying 
minimax property of the estimators of finite population mean. 


2.14 Invariance Property


Godambe (1968), Godambe and Thompson (1971) and Basu (1971) considered 
invariance criteria in connection with the estimation of finite population mean. 

The estimator $t(d)$ is said to be origin invariant if changing $\mathbf{Y} = (Y_1, \ldots, Y_N)$
to $\mathbf{Y}' = (Y_1 + a, \ldots, Y_N + a)$ makes $t(d') = a + t(d)$ for all $d$, where $d'$ denotes the
data obtained from $d$ under the transformed values. The estimator $t(d)$ is
said to be scale invariant if changing $\mathbf{Y} = (Y_1, \ldots, Y_N)$ to $\mathbf{Y}' = (bY_1, \ldots, bY_N)$
makes $t(d') = b\,t(d)$ for all $d$, $b > 0$. The estimator $t(d)$ is said to be linear
invariant if changing $\mathbf{Y} = (Y_1, \ldots, Y_N)$ to $\mathbf{Y}' = (a + bY_1, \ldots, a + bY_N)$ with $b > 0$
makes $t(d') = a + b\,t(d)$ for all $d$.
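The three invariance conditions can be spot-checked numerically for a given estimator. The sketch below assumes made-up data and compares the sample mean with a hypothetical shrunken mean that is scale invariant but not origin invariant; a check on one data set is only an illustration, not a proof of invariance for all $d$.

```python
from fractions import Fraction

def sample_mean(d):
    return Fraction(sum(y for _, y in d), len(d))

def shrunken_mean(d):
    # hypothetical estimator, used only as a counterexample
    return Fraction(9, 10) * sample_mean(d)

def transform(d, a=0, b=1):
    """Data d' obtained when every Y_i is changed to a + b * Y_i."""
    return [(i, a + b * y) for i, y in d]

def satisfies(t, d, a, b):
    """Check t(d') == a + b * t(d) for this particular d, a and b."""
    return t(transform(d, a, b)) == a + b * t(d)

d = [(2, 4), (5, 7), (6, 13)]  # hypothetical data: (label, y-value) pairs
```

Setting `b = 1` tests origin invariance, `a = 0` tests scale invariance, and general `a` and `b > 0` test linear invariance.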

2.15 Super Population Models 

Inferential problems in finite population sampling may be attacked from two 
different approaches - fixed population approach and super population 
approach. In the fixed population approach or in other words under classical 
theory of sampling from finite population, the statistical inference is 
characterized by the use of probability distribution generated by the random 
sampling plan $p$ defined on a collection of subsets $\{s\}$ of units labelled $1, 2, \ldots, N$,
with $p(s)$ as the probability of selection of the subset of sampling units whose
labels are in $s$. The survey variable $y$ assumes a fixed set of values $Y_1, Y_2, \ldots, Y_N$
attached to the units $1, 2, \ldots, N$ and the interest centres around estimating some
population parametric functions $\theta(Y_1, \ldots, Y_N)$ such as population mean, total,
variance, etc. The supporters of this approach emphasize that only the
probabilistic structure of the sampling process should be used
to judge the correctness of results derived from the sample. As such the
standard error which is a measure of precision of estimates of parametric
function is defined with respect to $p(s)$ only, where $Y_1, Y_2, \ldots, Y_N$ are treated as
constants.

As should be evident in later chapters, the fixed population approach most 
often does not lead to optimal strategies. Further, comparison of two different 
strategies, sometimes, does not enable one to get any definite conclusion. This 
has led the research workers to take the help of super population models where 
the randomness of $y$-values of the population is assumed. To exemplify the
concept of super population model in a real life situation, suppose the survey
scientist is interested to estimate average yield of rice in a particular geographic
area, where the units are agricultural fields. As the information is wanted for
future planning, it would be quite unrealistic to estimate the average yield based
on a single year. If one introduces a stochastic nature into the survey variable, the
yields of $N$ fields $(Y_1, Y_2, \ldots, Y_N)$ are a single realization of random variables
$(Y_1^*, Y_2^*, \ldots, Y_N^*)$ having a joint distribution $\xi(Y_1^*, \ldots, Y_N^*)$, which may be
either completely specified with all desired features or partially specified through its
mean and variance.

A super population model $\xi$ involves a priori knowledge of the nature of
the survey variable in the current situation. It is a mathematical model
formalizing certain important features of real world situation surrounding the
problem under investigation and clarifies our understanding about inferential
issues. It may be mentioned here that no mathematical model can exactly
reproduce the real life situation and it is simply an approximation expected to
be more or less realistic. Early users who made use of the concept are Cochran
(1939, 1946), Deming and Stephan (1941) and Madow and Madow (1944).

To estimate the population mean $\bar{Y} = \sum Y_i/N$ in the super population model
($m$) approach, $(Y_1, Y_2, \ldots, Y_N)$ is regarded as an outcome of $(Y_1^*, Y_2^*, \ldots, Y_N^*)$
with distribution $\xi$ and consequently $\bar{Y}$ as an outcome of $\bar{Y}^* = \sum Y_i^*/N$. The
available data are

$$d = \{(i, Y_i^*) : i \in s\},$$

where $s \in S$; $d$ is the observed outcome of the random variable $\{(i, Y_i^*) : i \in s\}$.

Define a statistic $t$, which for any given $s$ depends on $Y_1^*, \ldots, Y_N^*$
only through those $Y_i^*$ for which $i \in s$.

When $t$ is used for making inference about the population mean, $t$ is called
a predictor of the random variable $\bar{Y}^*$. Thus, $E_p(t)$ and $\mathrm{MSE}_p(t)$ are functions of
the random variables $Y_1^*, \ldots, Y_N^*$. $t$ may be either a $p$-unbiased (design
unbiased) or $m$-unbiased (model unbiased) or $pm$-unbiased (both design
and model unbiased) predictor.

Definition 2.19: $t$ is said to be a $p$-unbiased or design unbiased predictor for $\bar{Y}^*$, if
and only if for a given design $p$,

$$E_p(t) = \bar{Y} \quad \text{for all } \mathbf{Y}' = (Y_1, \ldots, Y_N) \in R^N,$$

where $t$ is based on $Y_i^* = Y_i$, $i \in s$.

The strategy $(p, t)$ is $p$-unbiased if $t$ is a $p$-unbiased predictor under $p$.

Definition 2.20: $t$ is said to be an $m$-unbiased predictor of $\bar{Y}^*$ iff for given $\xi$,
$E_m(t - \bar{Y}^*) = 0$, for all $s \in S$ for which $p(s) > 0$.

Definition 2.21: $t$ is said to be a $pm$-unbiased predictor of $\bar{Y}^*$ iff for given $p$ and
$\xi$, $E_m E_p(t - \bar{Y}^*) = 0$, identically in the parameters of the model.

Definition 2.22: $t$ is called an $m$-unbiased estimator of

$$E_m(\bar{Y}^*) = \sum_{i=1}^{N} \frac{\mu_i}{N} = \bar{\mu},$$

$\mu_i$ being $E_m(Y_i^*)$, iff $t$ is an $m$-unbiased predictor of $\bar{Y}^*$.

Similarly $t$ is called a $pm$-unbiased estimator of $\bar{\mu}$ if $t$ is a $pm$-unbiased
predictor of $\bar{Y}^*$.

The choice of a strategy $(p, t)$ will depend on the $m$-expected $p$-MSE. If $\bar{Y}^*$ is
our principal interest for prediction, the reasonable criterion to choose among
strategies should be $E_m \mathrm{MSE}_p(p, t) = E_m E_p(t - \bar{Y}^*)^2$, which is the sampling
mean square error (MSE) expected under the model, called the average
(anticipated) mean square error (AMSE), which takes into account randomness
both with respect to the conceptual superpopulation and the sampling design. Thus,
we write

$$\mathrm{AMSE}(p, t) = E_m E_p(t - \bar{Y}^*)^2 = E_p E_m(t - \bar{Y}^*)^2$$

$$= E_p V_m(t) + E_p[E_m(t - \bar{Y}^*)]^2 + V_m(\bar{Y}^*) - 2 E_m\big[\{\bar{Y}^* - E_m(\bar{Y}^*)\}\, E_p\{t - E_m(t)\}\big],$$

assuming a non-informative $p$. If $t$ is design unbiased,

$$\mathrm{AMSE}(p, t) = E_m V_p(t) = E_p[V_m(t) + B_m^2(t)] - V_m(\bar{Y}^*),$$

where $B_m(t) = E_m(t - \bar{Y}^*)$ and $V_m(t) = E_m[t - E_m(t)]^2$.

If $t$ is both design and model unbiased,

$$\mathrm{AMSE}(p, t) = E_m V_p(t) = E_p V_m(t) - V_m(\bar{Y}^*).$$

Further, if $\bar{\mu} = E_m(\bar{Y}^*)$ is the target for inference, the natural criterion
which emerges is $E_m(t - \bar{\mu})^2$. In classical survey sampling literature based on
the design based approach, averaging of $E_p(t - \bar{Y})^2$ over the super population $\xi$ is often
resorted to in order to simplify the comparison of two $p$-unbiased strategies. The
adherents of the model based approach feel that the requirement of $p$-unbiasedness
should not be overemphasized compared to $m$-unbiasedness and
$pm$-unbiasedness, and what matters after all for inference is the $\xi$-distribution
with $s$ treated as given. Thus, it seems desirable to look for $t$ for given $s$ to
minimize $E_m(t - \bar{Y}^*)^2$. It sometimes turns out that the predictor $t$ that minimizes
$E_m(t - \bar{Y}^*)^2$ also minimizes $E_m E_p(t - \bar{Y}^*)^2$ for any non-informative design $p$,
suggesting thereby that in the model based approach importance should be attached
to finding out a predictor $t$ which is good for any sample $s$ actually observed
rather than attempting to find a good strategy $(p, t)$.

Definition 2.23: Given two predictors $t_1$ and $t_2$ for any given design $p$, $t_1$ is said
to be better than $t_2$ if $E_m \mathrm{MSE}_p(p, t_1) \le E_m \mathrm{MSE}_p(p, t_2)$ for all $\xi$ belonging to $\mathcal{C}$, a
class of super populations.

A strategy $(p_1, t_1)$ is better than $(p_2, t_2)$ if $E_m \mathrm{MSE}_p(p_1, t_1) \le E_m \mathrm{MSE}_p(p_2, t_2)$
for all $\xi \in \mathcal{C}$.

Types of Super Population Models 

Frequently mentioned super population models $\xi$ in the survey sampling
literature are (i) Transformation model, (ii) Exchangeable model, (iii) Regression
model, (iv) Random permutation model.

In the Transformation Model (TM) the transformed variables defined by
$Z_i^* = (Y_i^* - b_i)/a_i$, $1 \le i \le N$, have the common mean $\mu$, variance $\sigma^2$ and
covariance $\rho\sigma^2$ for any pair $i \ne j$, with unspecified joint distribution. $\mu$, $\sigma^2$ and $\rho$
are usually unknown with $-1/(N - 1) \le \rho \le 1$.

Assumptions:

i) $a_i > 0$ with $\sum a_i = N$, $-\infty < b_i < +\infty$, $i = 1, \ldots, N$.

ii) $\mu_i = E_m(Y_i^*) = b_i + a_i\mu$, $\sigma_i^2 = V_m(Y_i^*) = a_i^2\sigma^2$,
$\sigma_{ij} = \mathrm{cov}_m(Y_i^*, Y_j^*) = a_i a_j \sigma^2 \rho$, $(i \ne j)$.

The basis of the transformation model rests in choosing the $a_i$ and $b_i$
suitably so as to standardize the first two moments of the transformed $Z_i^*$.




In the Exchangeable Model (EM) the $Z_i^*$'s ($i = 1, \ldots, N$) are
exchangeable, that is, their joint distribution is same for all $N!$ permutations of
$(Z_1^*, Z_2^*, \ldots, Z_N^*)$. Clearly the $Z_i^*$'s have same means, variances and covariances.

It may be observed that exchangeable models form a subclass of those under
transformation model. The concept of exchangeability in the finite population
inference has been extensively used in Bayesian context by Ericson (1969)
under the name exchangeable prior.

For the Regression Model (RM) the values $X_1, X_2, \ldots, X_N$ of a positively
valued auxiliary variable $x$ associated with the study variable $y$ through a
regression relation are assumed to be fixed and known in advance. Further, it is
assumed that

$$E_m(Y_i^* \mid X_i = x_i) = f(x_i),$$

$$V_m(Y_i^* \mid X_i = x_i) = \sigma^2 v(x_i),$$

$$\mathrm{cov}_m(Y_i^*, Y_j^* \mid X_i = x_i, X_j = x_j) = 0, \quad 1 \le i \ne j \le N.$$

In the linear regression model case $f(x_i) = \alpha + \beta x_i$, $1 \le i \le N$. The structure
generally assumed for $v(x_i)$ is $v(x_i) = x_i^g$, $g \ge 0$, where $g$ is known in advance either
from previous information or from experimental investigation. $g = 0$ implies
that the errors are homoscedastic.

In the Random Permutation Model (RPM) the population units are not
labelled for the purpose of identification. Instead the values of the study
variable $y$ are an ordered set $a_1 \le a_2 \le \ldots \le a_N$, where the $a_i$'s are unknown fixed
quantities. Now, suppose the units are labelled as $1, 2, \ldots, N$ at random. For any
permutation $(i_1, i_2, \ldots, i_N)$ of the labels

$$P(Y_1^* = a_{i_1}, Y_2^* = a_{i_2}, \ldots, Y_N^* = a_{i_N}) = 1/N!.$$

Thus, although $a_1, a_2, \ldots, a_N$ are fixed, the random labelling of the units
makes the values attached to the units of the population values assumed by
random variables $(Y_1^*, Y_2^*, \ldots, Y_N^*)$ having joint distribution as stated before.
For such a random permutation model

$$E_m(Y_i^*) = \bar{a}, \quad V_m(Y_i^*) = \sigma_a^2, \quad \mathrm{cov}_m(Y_i^*, Y_j^*) = -\sigma_a^2/(N - 1), \quad i \ne j,$$

where $\bar{a} = \sum a_i/N$ and $\sigma_a^2 = \sum (a_i - \bar{a})^2/N$.
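The three moments of the random permutation model can be confirmed by enumerating all $N!$ labellings of a small population. The sketch below assumes four made-up fixed values; the helper `E` and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

a = [1, 4, 4, 7]               # hypothetical fixed values a_1 <= ... <= a_N
N = len(a)
perms = list(permutations(a))  # every labelling (Y_1*, ..., Y_N*), N! in total
w = Fraction(1, len(perms))    # each labelling has probability 1/N!

def E(f):
    """Model expectation of f(Y*) under the random permutation model."""
    return sum(w * f(y) for y in perms)

a_bar = Fraction(sum(a), N)
sigma2 = sum((x - a_bar) ** 2 for x in a) / N

mean_1 = E(lambda y: y[0])                                # E_m(Y_1*)
var_1 = E(lambda y: (y[0] - a_bar) ** 2)                  # V_m(Y_1*)
cov_12 = E(lambda y: (y[0] - a_bar) * (y[1] - a_bar))     # cov_m(Y_1*, Y_2*)
```

The negative covariance reflects the fact that, once one unit has received a large value, fewer large values remain for the other units.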

In connection with finite population inference it is observed that under the 
assumption of a super population model, the sampling strategies usually 
adopted in practice are in some sense best having some desirable properties, 
thus justifying their uses in sample surveys. 

Prediction Approach 

In the prediction approach, as different from the classical one, the super population
model $\xi$ plays an essential role in finite population inference. For example, if
we are interested in estimating the finite population total $Y$, then $Y$ can be written as

$$Y = \text{observed total of sample values} + \text{total of values of non-observed units} = \sum_{i \in s} Y_i + \sum_{i \notin s} Y_i.$$

The problem of estimating $Y$, therefore, logically boils down to the problem of
predicting the unobserved part $\sum_{i \notin s} Y_i$. This is achieved by exploring a link
between the observed and the unobserved part through the super-population
model $\xi$. That is, from the $y$'s in $s$, we infer about $\xi$, which is then used to predict the
values of the $y$'s not in $s$. The theory of prediction in the finite population set up has
been tackled by different authors following different approaches. In this
connection mention may be made about least square prediction, fiducial
prediction (Kalbfleisch and Sprott, 1969) and Bayesian prediction (Ericson,
1969) techniques.
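The decomposition of the total into an observed and a predicted part can be sketched in code. The example below assumes, purely for illustration, a working model with $E_m(Y_i^*) = \beta x_i$ and variance proportional to $x_i$, for which weighted least squares gives $\hat\beta = \sum_{i \in s} y_i / \sum_{i \in s} x_i$; the auxiliary values, the observed data and the function name `predict_total` are all hypothetical, and other models would lead to other predictors.

```python
from fractions import Fraction

# Hypothetical auxiliary values, known for every unit of the population
x = {1: 10, 2: 20, 3: 30, 4: 40, 5: 50}
# Hypothetical y-values, observed only for the sampled units
y_obs = {2: 42, 5: 98}

def predict_total(x, y_obs):
    """Observed total plus model-based prediction of the unobserved total.

    Assumes the working model E(Y_i) = beta * x_i with variance proportional
    to x_i, so that beta is estimated by sum(y) / sum(x) over the sample.
    """
    sampled = set(y_obs)
    observed_total = sum(y_obs.values())
    beta_hat = Fraction(observed_total, sum(x[i] for i in sampled))
    unobserved_x = sum(x[i] for i in x if i not in sampled)
    return observed_total + beta_hat * unobserved_x, beta_hat

total_hat, beta_hat = predict_total(x, y_obs)
```

Under this particular working model the predictor coincides with the familiar ratio-type form $\hat\beta \sum_{i=1}^{N} x_i$.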

2.16 Robustness 

The non-existence of a minimum variance unbiased estimator for any given
sampling design (Godambe, 1955) led survey scientists to look for alternative
optimality criteria (Godambe, 1955; Royall, 1970, 1971; Royall and Herson,
1973a, b) to arrive at a reasonable inference. An estimator which is not
sensitive to moderate departures from underlying assumptions is said to be a 
robust estimator. The study of robustness in finite population context gets 
simplified by the use of super population models, which assumes one’s prior 
knowledge or belief about the population structure, consequently developing 
concepts like optimality and robustness for the choices of sampling strategies. 

Any sampling strategy which is best under a specific model should be the 
right choice if the real world situation is close to assumptions under the model. 
As a consequence of this, the super population approach may be disastrous if 
the assumed super population model fails. This has given rise to attempts to 
make super population model as general as possible incorporating wide range 
of conditions under which it performs well and to robustness as a desirable 
property of an estimator. Robustness considerations investigate how much 
efficiency of the proposed sampling strategy is lost under departures from the 
assumed model and suggest alternative robust strategies which are efficient 
under a broader class of models. If a strategy is sensitive to errors of
misspecification in the model, it is said to be poor from the robustness point of
view. A robust estimator performs well under broad conditions allowing 
uncertainty which is a part of real world situation. 

Godambe (1955), Godambe and Joshi (1965) and Godambe (1966) 
defined optimality criteria to select an unbiased estimator which has minimum 
expected variance under a prior distribution on the population. If such an 
estimator remains optimal under a wide class $\mathcal{C}$ of prior distributions, the
criterion is said to be robust. As the optimal estimator may be found to be
unique but not the sampling design, the optimality in estimation can be made
further robust to some departures from the model by choosing a suitable
sampling design from the class of optimal designs. However, strict design
unbiasedness is too restrictive a requirement to demand from an estimator.
This has led Brewer (1979) to float the concept of Asymptotic Design
Unbiasedness (ADU). The asymptotic approach demands that the strategies
derived from the assumed super population models should hold the desirable 
design properties such as unbiasedness and consistency asymptotically. In this 
manner some protection against model failure is achieved under safe sampling 
design grounds. In Brewer’s (1979) asymptotic approach the finite population 
of N units, as usual, is assumed to be a random sample from infinite super 
population. A class of estimators or more appropriately predictors is adopted
under a model. The condition of asymptotic design unbiasedness is then 
superimposed on the predictor. Next, probabilities of selection are chosen to 
minimise the design expectation of the model based mean square error of the 
predictor. 

Fuller and Isaki (1981) suggested an alternative framework where they
visualized a nested sequence of populations $U_k$ of size $N_k$ ($0 < N_1 < N_2 < \ldots$) and
a sequence of samples $s_k$ consisting of $n_k$ distinct units ($n_k \le N_k$).

Robustness can also be studied from the predictive approach. Here, the main
arguments are that the sample $s$ is the only relevant data at hand and the
sampling design plays no role in inference. These arguments lead to the concept of
balanced samples and invoke principles such as the likelihood principle
(Birnbaum, 1962), the flat likelihood function (Godambe, 1966) and the
conditionality principle (Cox, 1958).

Godambe and Thompson (1977) considered robust estimation in finite
populations and established that for estimating the finite population total the
estimators which are optimal under super population models can be made
insensitive to certain departures from the models by the choice of balanced
samples and also by stratification on a size variable (Royall, 1970a, 1971;
Royall and Herson, 1973a, b).

2.17 Conditionality 

In sample surveys specially on human populations the number of units required 
to be observed may not always be the same because of deaths, refusals, and not 
at homes, making thereby the sample size a random variable. Now the question 
arises whether to make the inference conditional on the sample size actually 
observed or making inference dependent on the random mechanism producing 
such a sample. The controversy between the conditional and unconditional approaches is a
classical one and dates back to the Fisherian era (Fisher, 1935) and was taken up
subsequently by Cox (1958), Birnbaum (1962) and Bartholomew (1967).
Durbin (1968) and Swain (1969) studied the inferential consequences of
randomness of sample size in large scale sample surveys in testing parametric
hypotheses.

Birnbaum (1962), while supporting the conditional frames of reference,
proposed an axiom concerning evidential meaning to state the principle of
conditionality. "If an experiment $E$ is (mathematically equivalent to) a mixture
$G$ of components $\{E_h\}$ with possible outcomes $(E_h, x_h)$, then $\mathrm{Ev}(E, (E_h, x_h)) =
\mathrm{Ev}(E_h, x_h)$. That is, the evidential meaning of any outcome $(E_h, x_h)$ of any
experiment $E$ having a mixture structure is the same as the evidential meaning
of the corresponding outcome $x_h$ of the corresponding component experiment
$E_h$, ignoring otherwise the overall structure of the original experiment $E$."

A random sample size whose probability distribution is known and does not
depend on the parameter in question forms a part of the minimal sufficient statistic
in classical statistical inference and is an ancillary statistic in the words of Fisher
(1935). If such an ancillary exists, the minimal sufficient is not complete and in 
such a case Fisher advocated the choice of reference set with the help of an 
ancillary for inferential purposes. However, there is no general way of finding 
an ancillary statistic when the minimal sufficient statistic is not complete. It is 
also not yet known whether the incompleteness of the minimal sufficient 
statistic will always lead to the existence of an ancillary. Basu (1964) put forward
an example to show that the ancillary statistic may also not be unique. As
observed by J. N. K. Rao (1985), in the conventional set up for inference in survey
sampling the sampling design defines a sample space $S$, the set of all subsets $\{s\}$,
and the associated probabilities of selection $p(s)$. The concepts of consistency,
unbiasedness and mean square errors are generally discussed in an
unconditional context with respect to $S$. However, the relevant reference set for
inference after the sample s has been observed may not be S, if the sample 
contains recognisable subsets. 

2.18 Different Classes of Estimators 

Linear Estimators 

In sample surveys the attention of the survey practitioner generally centres
on the estimation of the population total or mean. The estimators employed for
the purpose are usually linear estimators, that is, linear functions of the
observations. Neyman (1934) preferred linear estimators because their
distributions closely follow the normal law and because, by applying Markoff's
method, one can find a best linear estimator. Keeping in view the peculiarities
of sampling without replacement from a given finite population, Horvitz and
Thompson (1952) proposed three classes of estimators for the population total
and showed how the identifiability of units is instrumental in generating
different classes of linear estimators. The classes proposed by Horvitz and
Thompson are:

$$T_1 = \sum_{i=1}^{n} \alpha_{(i)}\, y_{(i)},$$

where $y_{(i)}$ $(i = 1, 2, \ldots, n)$ represents the outcome of the $i$th draw
and $\alpha_{(i)}$ $(i = 1, 2, \ldots, n)$ is a weight attached to the element
turned up at the $i$th draw. $T_1$ takes into account the order of selection of
the elements in the sample.

$$T_2 = \sum_{i \in s} \beta_i\, y_i,$$

where $\beta_i$ $(i = 1, 2, \ldots, N)$ is the weight attached to the $y$-value
of the $i$th unit of the population whenever it is included in the sample $s$,
and the summation is over all units in the sample $s$. $T_2$ takes into account
the presence or absence of any given population element in the sample.

$$T_3 = A_{s^*} \sum_{i \in s^*} y_{(i)},$$

where $A_{s^*}$ is a scalar defined for all possible ordered samples
$s^* \in S^*$ and $y_{(i)}$ is the $y$-value of the unit drawn at the $i$th draw
$(i = 1, 2, \ldots, n)$. Koop (1963) and Prabhu Ajgaonkar (1962) proposed
another class of estimators

$$T_4 = \sum_{r=1}^{n} \beta_{r\lambda}\, y_{\lambda},$$

where $\beta_{r\lambda}$ $(r = 1, 2, \ldots, n;\ \lambda = 1, 2, \ldots, N)$ is a
weight attached to the $\lambda$th unit of the population whenever it is
selected on the $r$th draw. Thus, the $T_4$ class of linear estimators is
defined by $N \times n$ weights and includes both $T_1$ and $T_2$.

Koop (1963) put forward three axioms which are three features inherent in 
the nature of actual process of sample selection according to any probability 
scheme resulting in three different types of sample formation. The axioms are: 
(i) The elements appear in order, (ii) A given element is either present or absent 
in the sample, (iii) The sample itself is one of the set of all possible particular or 
distinct samples. 

Considering the inadequacy of Horvitz and Thompson's classes of estimators
to cover all linear estimators, Godambe (1955) proposed a most general
homogeneous linear estimator of the form

$$T_5 = \sum_{\lambda \in s^*} \beta_{s^*\lambda}\, y_{\lambda},$$

where $\beta_{s^*\lambda}$ is defined in advance for all logically possible
$s^*$, which in all are $N^n$, and for $\lambda \in s^*$. $T_1$, $T_2$ and
$T_3$ are particular cases of $T_5$.

Hanurav (1966) defined a general class of estimators as

$$T = \alpha_s + \sum_{\lambda \in s} \beta_{s\lambda}\, y_{\lambda},$$

where $\alpha_s$ and $\beta_{s\lambda}$ do not depend on the $y$'s and the sum
is over all distinct units belonging to the sample.

Quadratic Estimators 

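The importance of quadratic estimators arises from the fact that the sampling
variance of a linear estimator and also the finite population variance are
quadratic functions of the form

$$\sum_{i=1}^{N} a_{ii}\, y_i^2 + \sum_{i \neq j} a_{ij}\, y_i y_j,$$

where the $a_{ij}$ $(i, j = 1, 2, \ldots, N)$ are constants independent of the
$y$'s.

Hanurav (1966) formulated a general class of homogeneous quadratic
estimators (g.h.q.e.)

$$q_{1s} = \sum_{\lambda \in s} \gamma_{s\lambda\lambda}\, y_\lambda^2 + \sum_{\lambda \neq \lambda' \in s} \gamma_{s\lambda\lambda'}\, y_\lambda y_{\lambda'},$$

which may be equivalently expressed in the form $\sum b(s, i, j)\, y_i y_j$, and
a general quadratic estimator (g.q.e.) as

$$q_{2s} = \alpha_s + \sum_{\lambda \in s} \beta_{s\lambda}\, y_\lambda + \sum_{\lambda \in s} \gamma_{s\lambda\lambda}\, y_\lambda^2 + \sum_{\lambda \neq \lambda' \in s} \gamma_{s\lambda\lambda'}\, y_\lambda y_{\lambda'}.$$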

2.19 Confidence Interval 

Sometimes we may think of estimating the parametric function $\theta(Y)$ of the
$y$-values by a confidence interval based on an unbiased estimator $t$
calculated from random samples of size $n$ drawn from a finite population
consisting of $N$ units with reference to the sampling design $p(s)$. The
confidence interval

$$C(s) = (t_L(s),\ t_U(s))$$

is a random interval with $t_L(s)$ and $t_U(s)$ as the lower and upper
confidence limits of the interval, such that $t_L(s) < t_U(s)$ for every $s$.

To assert that $\theta$ lies inside the random interval $C(s)$, one is
naturally required to make a probability statement about the coverage of
$\theta$. This coverage probability is called the confidence coefficient or
confidence level, say $(1 - \alpha)$, close to unity, where $\alpha$ is a small
risk of non-coverage.

One method of finding the confidence interval is to find the sampling
distribution of $(t - \theta)/\sqrt{V(t)}$ in repeated samples $s$ drawn with
the given sampling design $p(s)$, and thereby to calculate the limits $t_L(s)$
and $t_U(s)$ that cover $\theta$ with probability $(1 - \alpha)$. However, in
sample survey situations it is difficult to find a procedure that gives an
exact confidence statement, and hence an approximation is usually involved. The
setting of the confidence interval is simplified if the conditions for the
application of the central limit theorem are satisfied for the random variable
$t$, thus facilitating the normal theory approximation. Thus, the
$100(1 - \alpha)\%$ confidence interval for $\theta$ is computed approximately
as

$$C(s) = \left(t - z_{\alpha/2}\sqrt{V(t)},\ t + z_{\alpha/2}\sqrt{V(t)}\right).$$

If $V(t)$ is not known, which is usually so, a consistent estimator of $V(t)$,
say $\hat{V}(t)$, may be used to set the confidence interval. Hence, an
approximate confidence interval with confidence level $(1 - \alpha)$ is
computed as $t \pm z_{\alpha/2}\sqrt{\hat{V}(t)}$, where $z_{\alpha/2}$ is a
constant exceeded with probability $\alpha/2$ by an $N(0, 1)$ random variable
or, more appropriately, by Student's $t$-variable with $\nu$ degrees of
freedom. Traditional statistical theory shows that under certain conditions
certain statistics, in the case of sampling with replacement from a finite
population or sampling from an infinite population, follow approximately the
normal law. But in the case of sampling without replacement the problem becomes
more involved because the condition of independence is violated. Under simple
random sampling without replacement, Madow (1948), Erdős and Rényi (1959) and
Hájek (1960) have shown that, subject to certain conditions, the sample mean is
approximately normally distributed. Nandi and Sen (1963) dealt with the
limiting distribution of $U$-statistics, which was shown to be approximately
normal under certain conditions when simple random sampling is carried out
without replacement.

In sample survey theory it is sometimes difficult, or even impossible, to find
exactly unbiased estimators of some useful and meaningful parametric functions,
and under such circumstances one is often content with approximately unbiased
estimators. A large bias component inflates the mean square error of an
estimator, and consequently such an estimator is regarded as very poor. For a
valid confidence interval, in addition to a small mean square error, one should
be reasonably assured that the bias is small compared to the standard error
$\sqrt{V(t)}$.

To show the effect of bias on the confidence interval, define the relative bias
$RB = B(t)/\sqrt{V(t)}$, where $B(t)$ is the bias in $t$ defined by
$B(t) = E(t) - \theta$.

Assume that $z = (t - \theta)/\sqrt{V(t)}$ is normal or approximately normal
(with mean $RB$ and unit variance). Construct the confidence interval $C(s)$ as

$$C(s) = \left(t - z_{\alpha/2}\sqrt{V(t)},\ t + z_{\alpha/2}\sqrt{V(t)}\right).$$

The probability that this interval covers the true value $\theta$ is

$$P_\theta = P\left(t - z_{\alpha/2}\sqrt{V(t)} < \theta < t + z_{\alpha/2}\sqrt{V(t)}\right).$$

Hence $P_\theta$ can be calculated for each value of $B(t)/\sqrt{V(t)}$ using a
table of the normal distribution.

If $t$ is unbiased, $P_\theta = 1 - \alpha$.

As the relative bias increases in absolute value, $P_\theta$ decreases. Cochran
(1963) has shown that the decrease in the confidence coefficient $P_\theta$ is
only marginal for $|RB|$ less than 0.20 or, more conservatively, less than 0.10.
For values of $|RB|$ greater than 0.20, the decrease in $P_\theta$ becomes
marked. Thus, for small values of $|RB|$, the confidence statement on $\theta$
can be made as if no bias were present.
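The effect of the relative bias on the coverage probability can be tabulated directly. The following Python sketch assumes, as in the discussion above, that $(t - \theta)/\sqrt{V(t)}$ is normal with unit variance, so that its mean equals $RB$. The function names and the default multiplier 1.96 (a nominal 95% interval) are illustrative choices and do not come from the text.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def coverage_probability(relative_bias: float, z: float = 1.96) -> float:
    """Coverage of t +/- z*sqrt(V(t)) when (t - theta)/sqrt(V(t)) ~ N(RB, 1).

    The interval covers theta exactly when -z < (t - theta)/sqrt(V(t)) < z.
    """
    return normal_cdf(z - relative_bias) - normal_cdf(-z - relative_bias)


# Tabulate the coverage for a few illustrative values of |RB|.
for rb in (0.0, 0.1, 0.2, 0.5, 1.0):
    print(f"|RB| = {rb:.1f}   coverage = {coverage_probability(rb):.4f}")
```

The coverage depends on the bias only through $|RB|$, which is why the discussion is framed in terms of the bias relative to the standard error rather than the bias itself.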
3.1 Introduction

Simple random sampling is an equal probability sampling design and is the
simplest of all probability sampling schemes: every sample point in the sample
space has equal probability or, in other words, every sample of the same size
$n$ has the same chance of being selected. In this method of sampling the
population is supposed to consist of $N$ distinct and identifiable units
numbered $U_1, U_2, \ldots, U_N$, and the easiest method to draw a sample of
fixed size $n$ is to draw units one after another, either without replacement
or with replacement. Thus, we have as a result simple random sampling without
replacement (SRSWOR) or simple random sampling with replacement (SRSWR).

In simple random sampling with replacement a sequence of length $n$, including
repetitions, is selected out of $N$ units and the sample space consists of
$N^n$ possible sequences represented as sample points. In sampling without
replacement $n$ different units constituting a subset are selected out of a
population consisting of $N$ units, either as an ordered subset (maintaining
the order of appearance) or as an unordered subset. Thus, the sample space in
SRSWOR consists of either $N_{(n)} = N(N-1)\cdots(N-n+1)$ or $\binom{N}{n}$
sample points, depending on whether we consider all ordered subsets or all
unordered subsets.

Definition 3.1: A design $p$ is called SRSWOR if
$p(s) = 1\big/\left[n!\binom{N}{n}\right]$ for an ordered sample $s$ of size
$n$. In the case of an unordered SRSWOR sample, $p(s) = 1\big/\binom{N}{n}$.

A design $p$ is called SRSWR if $p(s) = 1/N^n$, where $n$, or $n(s)$, is the
size of $s$, fixed in advance. In this design the effective sample size is
$|s|$, the cardinality of $s$, that is, the number of distinct units in $s$.


3.2 Method of Selection of Simple Random Sample 

The method of selecting a simple random sample has the following steps.

i) Serially arrange the units from 1 to $N$.

ii) Starting from any arbitrary place in a table of random numbers, read a
random number having as many digits as $N$, rejecting any number whose
digits are all zero or which is greater than $N$; this is the serial
number of the selected unit of the population. Continue the process by
reading random numbers consecutively till $n$ units are drawn. If the same
random number occurs again, it is rejected in sampling without
replacement, whereas in sampling with replacement the corresponding unit
enters the sample once again.

The selection can also be made by preparing $N$ chits, with one serial number
marked on each chit. The chits are then well mixed and drawn one after another.
In sampling without replacement, the drawn chit is not returned to the lot,
whereas in sampling with replacement the drawn chit is returned after each
draw. Thus, in a sample drawn without replacement the units are distinct,
whereas there is a chance of duplication of units in sampling with replacement.
A simple random sample with or without replacement can also be drawn by
selecting one of the possible samples with equal probability.


3.3 Some Properties of Simple Random Sampling 


i) The probability of drawing any specified unit at any specified draw is the
same and is equal to $1/N$. For example, in SRSWOR the probability of
selecting any specified unit, say $U_i$, at the $r$th draw is

$$\frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdots\frac{N-(r-1)}{N-(r-2)}\cdot\frac{1}{N-r+1} = \frac{1}{N}.$$

In SRSWR the probability of selecting any specified unit at any specified draw
is also $1/N$, because with replacement the lot contains the same number of
units at every draw.

ii) Every one of the $\binom{N}{n}$ samples in SRSWOR and every one of the
$N^n$ samples in SRSWR has equal probability of being selected.

iii) In simple random sampling every unit of the population has the same
probability of falling in the sample.

In SRSWOR the probability for the $i$th unit $U_i$ $(i = 1, 2, \ldots, N)$ to
fall in the sample, i.e. the inclusion probability, is $\pi_i$ = Prob ($i$th
unit is selected at the 1st draw or 2nd draw ... or at the $n$th draw)
$= (1/N) + (1/N) + \cdots$ ($n$ terms) $= n/N$.

In SRSWR having $n$ draws the probability that the $i$th unit $U_i$ is included
in the sample is

$$\pi_i = 1 - \text{Prob}(U_i \text{ is not included in } n \text{ draws}) = 1 - (1 - 1/N)^n, \quad i = 1, 2, \ldots, N.$$

Suppose $f_i$ denotes the number of times the $i$th unit $U_i$ occurs in the
sample. Then $f_i$ $(= 0, 1, 2, \ldots, n)$ is a binomially distributed random
variable having mean $n(1/N)$ and variance $n(1/N)(1 - 1/N) \approx n/N$ for
large $N$. Thus, for large $N$, $f_i$ has an approximate Poisson distribution
with mean $n/N$.
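The two inclusion probabilities can be checked by listing every equally likely sample of a small population. The sketch below is only an illustration: the function name and its arguments are hypothetical, and exact fractions are used so that the comparison with $n/N$ and $1 - (1 - 1/N)^n$ is free of rounding error.

```python
from fractions import Fraction
from itertools import combinations, product


def inclusion_probability(N, n, unit=1, with_replacement=False):
    """Share of equally likely samples of size n that contain `unit`.

    SRSWOR: all unordered subsets of size n.
    SRSWR : all ordered sequences of length n (repetitions allowed).
    """
    units = range(1, N + 1)
    samples = product(units, repeat=n) if with_replacement else combinations(units, n)
    hits = total = 0
    for sample in samples:
        total += 1
        hits += unit in sample
    return Fraction(hits, total)


# SRSWOR with N = 6, n = 3: the inclusion probability should equal n/N.
print(inclusion_probability(6, 3))
# SRSWR with N = 4, n = 2: it should equal 1 - (1 - 1/N)^n.
print(inclusion_probability(4, 2, with_replacement=True))
```

Enumeration is feasible only for very small $N$ and $n$, since the number of sequences grows as $N^n$.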


3.4 Notations 


Let $y$ be the character under study. To each unit $U_i$ $(i = 1, 2, \ldots, N)$
of the population a real number $Y_i$, the value of the variate for the $i$th
unit, is attached.

We are interested primarily in estimating parametric functions such as the
population total of the $y$-values, $Y = \sum_{i=1}^{N} Y_i$; the population
mean, $\bar{Y} = \sum_{i=1}^{N} Y_i / N$; the population variance of the
$y$-values,

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 = \frac{N-1}{N}\,S^2, \quad \text{where} \quad S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2;$$

the population coefficient of variation; the population proportion of a certain
attribute; etc.

Let $y_i$ $(i = 1, 2, \ldots, n)$ denote the value of the characteristic
(variate) $y$ for the sample unit $u_i$ selected at the $i$th draw.

The sample statistic corresponding to the population mean $\bar{Y}$ is

$$\bar{y} = \sum_{i=1}^{n} y_i / n,$$

and that corresponding to $S^2$ is

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2.$$

3.5 Some Important Theorems 

Theorem 3.1: Under SRSWOR $(N, n)$ the sample mean $\bar{y}$ is an unbiased
estimate of the population mean $\bar{Y}$ and

$$V(\bar{y}) = (n^{-1} - N^{-1})\,S^2 = (1 - f)\,S^2/n, \quad \text{where } f = n/N.$$

Proof. Write

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} I_i Y_i,$$

where $I_i$ is an indicator variable defined as $I_i = 1$ if the $i$th unit of
the population is in the sample and $I_i = 0$ otherwise, and the $Y_i$'s are a
set of fixed constants. Then $P(I_i = 1) = n/N$ and $P(I_i = 0) = 1 - (n/N)$.
Thus,

$$E(I_i) = n/N \quad \text{and} \quad E(I_i I_j) = \frac{n(n-1)}{N(N-1)} \quad (i \neq j).$$

Now,

$$E(\bar{y}) = E\left[\frac{1}{n}\sum_{i=1}^{N} I_i Y_i\right] = \frac{1}{n}\sum_{i=1}^{N} E(I_i)\,Y_i = \sum_{i=1}^{N}\frac{Y_i}{N} = \bar{Y}.$$

Thus, $\bar{y}$ is an unbiased estimate of $\bar{Y}$. Further,

$$V(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N} Y_i^2\,V(I_i) + \sum_{i \neq j} Y_i Y_j\,\mathrm{cov}(I_i, I_j)\right].$$

Now,

$$V(I_i) = E(I_i^2) - (E(I_i))^2 = (n/N) - (n/N)^2 = (n/N)[1 - (n/N)],$$

$$\mathrm{cov}(I_i, I_j) = E(I_i I_j) - E(I_i)E(I_j) = (n/N)[(n-1)/(N-1) - n/N].$$

Thus, on substitution and simplification,

$$V(\bar{y}) = \left(\frac{1}{n} - \frac{1}{N}\right)\sum_{i=1}^{N}\frac{(Y_i - \bar{Y})^2}{N-1} = (1 - f)\frac{S^2}{n}.$$

The factor $(1 - f)$ is termed the finite population correction factor,
abbreviated as f.p.c.

The standard error of $\bar{y}$ is

$$\text{S.E.}(\bar{y})_{\text{SRSWOR}} = \sqrt{V(\bar{y})_{\text{SRSWOR}}} = (S/\sqrt{n})\sqrt{1 - f}.$$

Corollary 1: The estimate of the population total $Y$ is $\hat{Y} = N\bar{y}$,
with

$$V(\hat{Y}) = N^2 V(\bar{y}) = N^2 (1 - f)\,S^2/n$$

and

$$\text{S.E.}(\hat{Y}) = N\sqrt{1 - f}\,(S/\sqrt{n}).$$

Theorem 3.2: Under SRSWR $(N, n)$ the sample mean $\bar{y}$ is an unbiased
estimate of $\bar{Y}$ and $V(\bar{y}) = \sigma^2/n = (1 - 1/N)\,S^2/n$.

Proof. Define

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} f_i Y_i,$$

where $f_i$ is the number of times the $i$th unit of the population occurs in
the sample. Thus, $f_i$ is a binomial variable taking values
$0, 1, 2, \ldots, n$, the probability that the $i$th unit is drawn being $1/N$
at each draw. Thus,

$$E(f_i) = n(1/N) \quad \text{and} \quad V(f_i) = n(1/N)(1 - 1/N).$$

As $f_i$ and $f_j$ jointly follow the multinomial law,
$\mathrm{cov}(f_i, f_j) = -n/N^2$. Now

$$E(\bar{y}) = E\left[\frac{1}{n}\sum_{i=1}^{N} f_i Y_i\right] = \frac{1}{n}\sum_{i=1}^{N} E(f_i)\,Y_i = \frac{1}{N}\sum_{i=1}^{N} Y_i = \bar{Y}.$$

Further,

$$V(\bar{y}) = V\left[\frac{1}{n}\sum_{i=1}^{N} f_i Y_i\right] = \frac{1}{n^2}\left[\sum_{i=1}^{N} Y_i^2\,V(f_i) + \sum_{i \neq j} Y_i Y_j\,\mathrm{cov}(f_i, f_j)\right]$$

$$= \frac{1}{n^2}\left[\frac{n}{N}\left(1 - \frac{1}{N}\right)\sum_{i=1}^{N} Y_i^2 - \frac{n}{N^2}\sum_{i \neq j} Y_i Y_j\right] = \frac{1}{Nn}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 = \frac{\sigma^2}{n} = \left(1 - \frac{1}{N}\right)\frac{S^2}{n}.$$

Theorem 3.3: Under SRSWOR $(N, n)$, $s^2$ is an unbiased estimate of $S^2$.

Proof. Since

$$(n-1)s^2 = \sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2,$$

we have

$$\frac{n-1}{n}E(s^2) = E\left(\sum_{i=1}^{n}\frac{y_i^2}{n}\right) - E(\bar{y}^2) = \frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \left(V(\bar{y}) + \bar{Y}^2\right) = \frac{n-1}{n}\,S^2,$$

from which the theorem follows.

Alternative Proof: Write

$$s^2 = \frac{1}{2\binom{n}{2}}\sum_{i>j}^{n}(y_i - y_j)^2 = \frac{1}{2\binom{n}{2}}\sum_{i>j}^{N} I_i I_j (Y_i - Y_j)^2.$$

Now, since $E(I_i I_j) = n(n-1)/[N(N-1)]$ and $2\binom{n}{2} = n(n-1)$,

$$E(s^2) = \frac{1}{N(N-1)}\sum_{i>j}^{N}(Y_i - Y_j)^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 = S^2.$$

Theorem 3.4: Under SRSWR $(N, n)$, $E(s^2) = \sigma^2$.

Proof. Now,

$$\frac{n-1}{n}E(s^2) = \frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \left(V(\bar{y}) + \bar{Y}^2\right) = \frac{(n-1)\,\sigma^2}{n},$$

from which the theorem follows.


3.6 Comparison of Efficiency (Precision) of SRSWOR and SRSWR 

Express $V(\bar{y})_{\text{SRSWOR}} = [1 - (n-1)/(N-1)]\,\sigma^2/n$ and
$V(\bar{y})_{\text{SRSWR}} = \sigma^2/n$. Thus, it follows that for $n > 1$,
$V(\bar{y})_{\text{SRSWOR}} < V(\bar{y})_{\text{SRSWR}}$. Defining the
efficiency of an estimate as the inverse of its variance, we may infer that
SRSWOR is more efficient than SRSWR for estimating the population mean by the
sample mean $\bar{y}$.
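The comparison can be made concrete for any small population whose values are fully known. The following sketch evaluates the two variance formulas of Theorems 3.1 and 3.2 with exact fractions; the function names are illustrative, and the four values used are simply a convenient small population.

```python
from fractions import Fraction


def var_mean_srswor(values, n):
    """V(ybar) = (1 - n/N) * S^2 / n, with S^2 using divisor N - 1."""
    N = len(values)
    mean = Fraction(sum(values), N)
    S2 = sum((Fraction(v) - mean) ** 2 for v in values) / (N - 1)
    return (1 - Fraction(n, N)) * S2 / n


def var_mean_srswr(values, n):
    """V(ybar) = sigma^2 / n, with sigma^2 using divisor N."""
    N = len(values)
    mean = Fraction(sum(values), N)
    sigma2 = sum((Fraction(v) - mean) ** 2 for v in values) / N
    return sigma2 / n


def relative_efficiency(values, n):
    """Efficiency of SRSWOR relative to SRSWR (ratio of the variances)."""
    return var_mean_srswr(values, n) / var_mean_srswor(values, n)


population = [2, 4, 6, 12]
print(var_mean_srswor(population, 2), var_mean_srswr(population, 2))
print(relative_efficiency(population, 2))
```

Algebraically the ratio reduces to $(N-1)/(N-n)$, so it depends only on $N$ and $n$ and not on the $y$-values.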


3.7 Estimators of Variance and Standard Error 

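An unbiased estimator of $V(\bar{y})$ under SRSWOR $(N, n)$ is given by
(Theorem 3.3)

$$\text{Est.}\,V(\bar{y})_{\text{SRSWOR}} = \text{Est.}\,[(1 - f)\,S^2/n] = (1 - f)\,s^2/n, \quad \text{as } E(s^2) = S^2.$$

An unbiased estimator of $V(\bar{y})$ under SRSWR $(N, n)$ is given by

$$\text{Est.}\,V(\bar{y})_{\text{SRSWR}} = \text{Est.}\,(\sigma^2/n) = s^2/n, \quad \text{as } E(s^2) = \sigma^2 \text{ (Theorem 3.4)}.$$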


Alternative Method 

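i) For SRSWR,

$$\text{Est.}\,V(\bar{y}) = \frac{1}{n}\left[\text{Est.}\left(\frac{1}{N}\sum_{i=1}^{N} Y_i^2\right) - \text{Est.}\,\bar{Y}^2\right] = \frac{1}{n}\left[\frac{1}{n}\sum_{i=1}^{n} y_i^2 - \text{Est.}\,\bar{Y}^2\right].$$

Since $V(\bar{y}) = E(\bar{y}^2) - \bar{Y}^2$, we may use the relation
Est. $V(\bar{y}) = \bar{y}^2 -$ Est. $\bar{Y}^2$ or, equivalently,

$$\text{Est.}\,\bar{Y}^2 = \bar{y}^2 - \text{Est.}\,V(\bar{y}).$$

Thus,

$$n\,\text{Est.}\,V(\bar{y}) = \frac{1}{n}\sum_{i=1}^{n} y_i^2 - \bar{y}^2 + \text{Est.}\,V(\bar{y}).$$

Therefore, on simplification, Est. $V(\bar{y}) = s^2/n$.

ii) For SRSWOR,

$$\text{Est.}\,V(\bar{y}) = \left[1 - \frac{n-1}{N-1}\right]\frac{1}{n}\left[\frac{1}{n}\sum_{i=1}^{n} y_i^2 - \text{Est.}\,\bar{Y}^2\right] = \left[1 - \frac{n-1}{N-1}\right]\frac{1}{n}\left[\frac{1}{n}\sum_{i=1}^{n} y_i^2 - \bar{y}^2 + \text{Est.}\,V(\bar{y})\right],$$

which, on simplification, leads to

$$\text{Est.}\,V(\bar{y}) = (1 - f)\,s^2/n, \quad f = n/N.$$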

An exact unbiased estimate of S.E.$(\bar{y})$ is not available. However, a
biased but consistent estimate of S.E.$(\bar{y})$ in SRSWOR is given by

$$\text{Est. S.E.}(\bar{y}) = \sqrt{1 - f}\;s/\sqrt{n}.$$

It has been observed by Hansen, Hurwitz and Madow (1953) that approximately
$E(s) = S\,[1 - (1/8)\,V(s^2)/S^4]$, showing a negative bias for $s$.

Again, under SRSWOR,

$$V(s^2) = S^4\left[\frac{(N-n)(Nn-N-n-1)(N-1)}{n(n-1)N(N-2)(N-3)}\,\beta_2(y) - \frac{(N-n)(N^2 n - 3N^2 + 6N - 3n - 3)}{n(n-1)N(N-2)(N-3)}\right] \quad (\text{provided } N > 3)$$

$$\approx (1 - f)\,(S^4/n)\,[\beta_2(y) - (n-3)/(n-1)],$$

if $N$ is moderately large compared to $n$, $\beta_2(y)$ being the coefficient
of kurtosis in the population. For a normally distributed $y$,
$V(s^2) = 2S^4/(n-1)$.
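The exact expression for $V(s^2)$ can be checked against a complete enumeration of SRSWOR samples from a small population. The sketch below is illustrative only. It assumes that $\beta_2(y)$ is the ratio of the fourth central moment to the squared variance, both with divisor $N$, and that the second term in the bracket enters with a minus sign; neither assumption is stated explicitly in the text. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def var_s2_enumerated(values, n):
    """Variance of s^2 over all equally likely SRSWOR samples of size n."""
    def s2(sample):
        m = Fraction(sum(sample), n)
        return sum((Fraction(v) - m) ** 2 for v in sample) / (n - 1)

    all_s2 = [s2(sample) for sample in combinations(values, n)]
    mean = sum(all_s2) / len(all_s2)
    return sum((x - mean) ** 2 for x in all_s2) / len(all_s2)


def var_s2_formula(values, n):
    """Closed-form V(s^2) under SRSWOR (requires N > 3)."""
    N = len(values)
    mean = Fraction(sum(values), N)
    dev = [Fraction(v) - mean for v in values]
    S2 = sum(d ** 2 for d in dev) / (N - 1)
    sigma2 = sum(d ** 2 for d in dev) / N
    # Assumption: beta_2 = mu_4 / sigma^4 with divisor-N moments.
    beta2 = (sum(d ** 4 for d in dev) / N) / sigma2 ** 2
    denom = n * (n - 1) * N * (N - 2) * (N - 3)
    c1 = Fraction((N - n) * (N * n - N - n - 1) * (N - 1), denom)
    c2 = Fraction((N - n) * (N * N * n - 3 * N * N + 6 * N - 3 * n - 3), denom)
    return S2 ** 2 * (c1 * beta2 - c2)


population = [2, 4, 6, 12]
print(var_s2_enumerated(population, 2), var_s2_formula(population, 2))
```

For this population the two computations agree for both $n = 2$ and $n = 3$, which supports the reading of the formula adopted here.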


3.8 Alternative Improved Estimate of Population Mean in With-Replacement
Sampling

We have already considered an estimator in SRSWR based on the values of all
selected units, including repetitions. Basu (1958) and Des Raj and Khamis
(1958) suggested an estimator based on the distinct units and showed that their
estimator is always better than the estimator which takes repeated values into
account.

Theorem 3.5: Let there be $u$ distinct units in an SRSWR $(N, n)$ sample and
let $f_r$ be the frequency of the $r$th distinct unit occurring in the sample.
Then

$$E(\bar{y}_u) = \bar{Y}, \quad V(\bar{y}_u) = [E(1/u) - 1/N]\,S^2$$

and

$$V(\bar{y}_u) \le V(\bar{y}_n),$$

where

$$\bar{y}_u = \frac{1}{u}\sum_{r=1}^{u} y_r \quad \text{and} \quad \bar{y}_n = \frac{1}{n}\sum_{r=1}^{u} f_r\, y_r.$$

Proof. For a given number $u$ of distinct units in an SRSWR $(N, n)$, the
sample of $u$ distinct units can be regarded as a sample selected without
replacement. Thus

$$E(\bar{y}_u \mid u) = \bar{Y} \quad \text{and} \quad V(\bar{y}_u \mid u) = (u^{-1} - N^{-1})\,S^2.$$

Now

$$E(\bar{y}_u) = E[E(\bar{y}_u \mid u)] = E(\bar{Y}) = \bar{Y}$$

and

$$V(\bar{y}_u) = E[V(\bar{y}_u \mid u)] + V[E(\bar{y}_u \mid u)] = E(u^{-1} - N^{-1})\,S^2 = [E(1/u) - N^{-1}]\,S^2.$$

Again, for a given set of $u$ distinct units, the probability that a specified
distinct unit $u_r$ will be selected at any of the $n$ draws is $1/u$ and
therefore

$$E(f_r \mid u) = n(1/u).$$

Thus,

$$E(\bar{y}_n \mid u) = E\left[\frac{1}{n}\sum_{r=1}^{u} f_r\, y_r \,\Big|\, u\right] = \frac{1}{u}\sum_{r=1}^{u} y_r = \bar{y}_u.$$

Hence,

$$V(\bar{y}_n) = E[V(\bar{y}_n \mid u)] + V[E(\bar{y}_n \mid u)] = E[V(\bar{y}_n \mid u)] + V(\bar{y}_u),$$

which proves $V(\bar{y}_u) \le V(\bar{y}_n)$ (Des Raj, 1968).

As shown by Pathak (1961), Korwar and Serfling (1970) and Asok (1980),

$$E\left(\frac{1}{u}\right) = N^{-n}\sum_{j=1}^{N} j^{\,n-1} = \frac{1}{n} + \frac{1}{2N} + \frac{n-1}{12N^2} + O(N^{-4}), \quad \text{for } n > 2.$$

Hence,

$$V(\bar{y}_u) = \left[\frac{1}{n} - \frac{1}{2N} + \frac{n-1}{12N^2} + O(N^{-4})\right]S^2 \approx \left[\frac{1}{n} - \frac{1}{2N}\right]S^2,$$

neglecting terms of order higher than $1/N$. With the help of this approximate
expression for $V(\bar{y}_u)$ it may be verified that

$$V(\bar{y}_u) < V(\bar{y}_n)_{\text{SRSWR}} \quad \text{for } n > 2.$$

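For $u \ge 2$ an unbiased estimator of $S^2$ is provided by

$$s_u^2 = \frac{1}{u-1}\sum_{r=1}^{u}(y_r - \bar{y}_u)^2.$$

An unbiased estimator of $V(\bar{y}_u)$ is given by

$$\text{Est.}\,V(\bar{y}_u) = \left[\left(\frac{1}{u} - \frac{1}{N}\right) + \frac{N-1}{N^n - N}\right]s_d^2,$$

where

$$s_d^2 = s_u^2 \ \text{ for } u \ge 2, \qquad s_d^2 = 0 \ \text{ for } u = 1.$$

As the contribution of the second term inside the bracket is likely to be
negligibly small, Est. $V(\bar{y}_u) \approx (u^{-1} - N^{-1})\,s_u^2$.

Pathak (1962a) has considered several estimators of the population mean in
SRSWR based on distinct units which belong to the class

$$\bar{y}_{f(u)} = [f(u)/E(f(u))]\,\bar{y}_u.$$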
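The results of this section can be verified exactly on a small population by enumerating all $N^n$ equally likely SRSWR sequences. The sketch below is illustrative: the function name and the returned dictionary keys are hypothetical, and the second term $(N-1)/(N^n - N)$ of the variance estimator is a reconstruction that the enumeration is used to check, not a quotation from the sources cited above.

```python
from fractions import Fraction
from itertools import product


def distinct_unit_check(values, n):
    """Exact moments of ybar_u and ybar_n over all N^n SRSWR sequences."""
    N = len(values)
    pop_mean = Fraction(sum(values), N)
    total = Fraction(N) ** n                      # number of sequences
    e_inv_u = var_u = var_n = e_est = Fraction(0)
    for draws in product(range(N), repeat=n):
        distinct = sorted(set(draws))
        u = len(distinct)
        ybar_u = Fraction(sum(values[i] for i in distinct), u)
        ybar_n = Fraction(sum(values[i] for i in draws), n)
        # s_d^2: sample variance of the distinct units, taken as 0 when u = 1.
        s2_d = (sum((values[i] - ybar_u) ** 2 for i in distinct) / (u - 1)
                if u >= 2 else Fraction(0))
        # Assumed form of the unbiased variance estimator (see lead-in).
        est = (Fraction(1, u) - Fraction(1, N) + Fraction(N - 1, N ** n - N)) * s2_d
        e_inv_u += Fraction(1, u) / total
        var_u += (ybar_u - pop_mean) ** 2 / total  # both means are unbiased
        var_n += (ybar_n - pop_mean) ** 2 / total
        e_est += est / total
    return {"E(1/u)": e_inv_u, "V(ybar_u)": var_u,
            "V(ybar_n)": var_n, "E(Est V)": e_est}


for name, value in distinct_unit_check([2, 4, 6, 12], 3).items():
    print(name, value)
```

For $n = 2$ the two means coincide in every sample, so a sample size of at least 3 is needed to see the gain from discarding repetitions.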


3.9 Confidence Interval for Population Mean 


If $\bar{y}$ is exactly normally distributed, the confidence interval for
$\bar{Y}$ with desired confidence coefficient $(1 - \alpha)$, where $\alpha$ is
a small probability that $\bar{Y}$ will be outside the interval, can be
computed using the usual normal theory in the form

$$\bar{y} - t_{\alpha/2}\sqrt{V(\bar{y})} < \bar{Y} < \bar{y} + t_{\alpha/2}\sqrt{V(\bar{y})}$$

or

$$\bar{y} - t_{\alpha/2}\sqrt{\text{Est.}\,V(\bar{y})} < \bar{Y} < \bar{y} + t_{\alpha/2}\sqrt{\text{Est.}\,V(\bar{y})},$$

where $t_{\alpha/2}$ depends on $\alpha$, according as $V(\bar{y})$ is
completely known or is estimated from the sample of size $n$. The value of
$t_{\alpha/2}$ in the first case is obtained from the normal probability
integral table and in the latter case from Student's $t$-table corresponding to
$(n - 1)$ degrees of freedom.

Thus, the $100(1 - \alpha)\%$ confidence limits for $\bar{Y}$ are

$$\bar{y} \pm t_{\alpha/2}\sqrt{V(\bar{y})} \quad \text{or} \quad \bar{y} \pm t_{\alpha/2}\sqrt{\text{Est.}\,V(\bar{y})},$$

as the case may be, depending on whether the population variance $\sigma^2$ is
known or estimated from the sample.

The confidence interval for the population total $Y$ can be calculated by
multiplying the confidence limits for $\bar{Y}$ by $N$. In sampling with
replacement from a finite population the $y_i$'s $(i = 1, 2, \ldots, n)$ are
independent and identically distributed random variables, and asymptotic
normality of the sample mean $\bar{y}$ follows from the central limit theorem
under the Lindeberg-Lévy conditions. However, in sampling without replacement
the $y_i$'s are no longer independent. A framework for finite population
asymptotics is provided by Hájek (1960), following the work of Madow (1948) and
Erdős and Rényi (1959). He considers an infinite sequence of simple random
sampling experiments, the $r$th of which has sample size $n_r$ and refers to a
population of size $N_r$ with values $y_{r1}, y_{r2}, \ldots, y_{rN_r}$.

Let $\xi_r = \sum_{i=1}^{n_r} y_{ri}$, the sum of the sampled values, be the
random variable corresponding to the $r$th sampling experiment. Supposing
$n_r \to \infty$, he lays down conditions on the values $y_{ri}$ and on $n_r$
and $N_r$ under which $\xi_r$ converges to a limiting normal distribution law.
We state the following theorem due to Hájek (1960).

Theorem 3.6: Let $U_{r\tau}$ be the subset of elements of
$U_r = (1, 2, \ldots, N_r)$ on which the inequality

$$|y_{ri} - \bar{Y}_r| > \tau\sqrt{V(\xi_r)}$$

holds; let $V(\xi_r)$ denote the variance of $\xi_r$, given by

$$V(\xi_r) = \frac{n_r (N_r - n_r)}{N_r}\cdot\frac{1}{N_r - 1}\sum_{i \in U_r}(y_{ri} - \bar{Y}_r)^2;$$

and suppose that $n_r \to \infty$ and $N_r - n_r \to \infty$.

Then the random variable $\xi_r$ is asymptotically normally distributed with
mean $E(\xi_r)$ and variance $V(\xi_r)$ if and only if

$$\lim_{r \to \infty}\ \frac{\sum_{i \in U_{r\tau}}(y_{ri} - \bar{Y}_r)^2}{\sum_{i \in U_r}(y_{ri} - \bar{Y}_r)^2} = 0$$

for any $\tau > 0$.

This implies that $\xi_r$ is asymptotically $N(E(\xi_r), V(\xi_r))$.

In the case of simple random sampling with replacement, Pathak (1962) showed
that the sample mean $\bar{y}_u$ based on the distinct units is asymptotically
normally distributed with parameters
$\left(\bar{Y},\ (1/E(u) - 1/N)\,(N-1)^{-1}\sum (Y_i - \bar{Y})^2\right)$ under
the Erdős-Rényi-Hájek condition.


3.10 Estimation of Proportion 

In a sample survey we may sometimes be required to estimate the proportion $P$
of units in a finite population belonging to a particular class $C$, for
example, the proportion of villages in a region having no irrigation
facilities, the proportion of unemployed persons in a locality, the proportion
of factories having fixed capital exceeding a certain specified amount, the
proportion of persons having a certain attribute, etc.

The estimation of the proportion $P$ in a finite population is the same as the
estimation of the population mean $\bar{Y}$, where a dichotomous variable $Y_i$
is attached to each unit of the population and takes the value 1 or 0 according
to whether the $i$th unit belongs to class $C$ or not. Thus,

$$\sum_{i=1}^{N} Y_i = N_1 = \text{number of units belonging to class } C.$$

Hence,

$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i = \frac{N_1}{N} = P.$$

Consequently,

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \bar{Y}^2 = P - P^2 = PQ,$$

where $Q = 1 - P$.

Suppose an SRSWR of size $n$ is drawn from a finite population. The sample
values $y_1, y_2, \ldots, y_n$ are also dichotomous variables taking the value
1 or 0 according to whether the sample unit belongs to $C$ or not.

Then

$$\bar{y} = \sum_{i=1}^{n}\frac{y_i}{n} = \frac{n_1}{n} = p, \text{ say},$$

where $n_1$ is the number of units in the sample falling in class $C$. As
$E(\bar{y}) = \bar{Y}$, it directly follows by redefining the values of $y$
that $E(p) = P$ and $V(p) = V(\bar{y}) = \sigma^2/n = PQ/n$.

In SRSWOR it may be easily seen that $E(p) = P$ and

$$V(p) = V(\bar{y}) = (n^{-1} - N^{-1})\,S^2 = [(N - n)/(N - 1)]\,PQ/n.$$

Hence, we have the following theorems.

Theorem 3.7: Both in SRSWR and SRSWOR the sample proportion $p = n_1/n$ is an
unbiased estimate of the population proportion $P = N_1/N$.

Theorem 3.8: The variance of $p$ in SRSWR is $V(p) = PQ/n$ and in SRSWOR is

$$V(p) = [(N - n)/(N - 1)]\,PQ/n.$$

Corollary 1. S.E.$(p)_{\text{SRSWR}} = \sqrt{PQ/n}$ and

$$\text{S.E.}(p)_{\text{SRSWOR}} = \sqrt{(N - n)/(N - 1)}\,\sqrt{PQ/n}.$$

Corollary 2. Let $\hat{N}_1$ = Est. $(N_1) = Np$ denote the estimated total
number of units in class $C$. Then in SRSWOR $V(\hat{N}_1) = N^2 V(p)$.

As

$$s^2 = \left(\sum y_i^2 - n\bar{y}^2\right)/(n - 1) = [1/(n - 1)]\,(np - np^2) = [n/(n - 1)]\,pq, \quad q = 1 - p,$$

the unbiased estimates of $V(p)$ in SRSWR and SRSWOR are $pq/(n - 1)$ and
$(1 - f)\,pq/(n - 1)$ respectively. As

$$\frac{p - P}{\sqrt{[(N - n)/N]\,pq/(n - 1)}}$$

is approximately normally distributed for large $n$ and for $p$ not very small
(or very large), the confidence limits for $P$ with confidence coefficient
$(1 - \alpha)$ are given by

$$p \pm t_{\alpha/2}\sqrt{(1 - f)\,p(1 - p)/(n - 1)}.$$

In case $P$ is very small (or very large), or $n$ is small, or both,
approximate confidence limits in the case of sampling without replacement are
obtained by using the hypergeometric distribution, for which tables have been
prepared by Lieberman and Owen (1961) and Chung and DeLury (1950). For large
$N$ the hypergeometric distribution is approximated by the binomial
distribution, which is also the distribution that applies in the case of
sampling with replacement.
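The large-sample confidence limits for $P$ under SRSWOR are easily computed from the sample count. The following sketch is a minimal illustration under the normal approximation described above. The function name, its arguments and the default multiplier 1.96 (for a 95% interval) are assumptions made for the example.

```python
import math


def proportion_confidence_limits(n1, n, N, multiplier=1.96):
    """Sample proportion, estimated S.E. and approximate limits under SRSWOR.

    n1: number of sample units in class C; n: sample size; N: population size.
    Uses the unbiased variance estimate (1 - f) * p * q / (n - 1).
    """
    p = n1 / n
    f = n / N
    est_se = math.sqrt((1 - f) * p * (1 - p) / (n - 1))
    return p, est_se, (p - multiplier * est_se, p + multiplier * est_se)


# Hypothetical survey: 30 of 100 sampled units fall in class C, N = 1000.
p, est_se, (lower, upper) = proportion_confidence_limits(30, 100, 1000)
print(f"p = {p:.3f}, Est. S.E. = {est_se:.4f}, limits = ({lower:.3f}, {upper:.3f})")
```

When $p$ is close to 0 or 1, or $n$ is small, these limits should not be relied upon and the hypergeometric or binomial tables mentioned above are preferable.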

3.11 Determination of Sample Size n for Estimating Population Mean Y 

a) Specified margin of error. With permissible margin of error $d$ and a small
risk $\alpha$ of incurring an error larger than $d$, we require
$P(|\bar{y} - \bar{Y}| > d) = \alpha$. Assuming $\bar{y}$ to be
approximately normally distributed with standard deviation
$\sigma(\bar{y}) = \sqrt{1 - f}\,S/\sqrt{n}$, we have
$d = t_{\alpha/2}\sqrt{1 - f}\,S/\sqrt{n}$, where $t_{\alpha/2}$ is the
upper $\alpha/2$ point of the standard normal distribution. This gives
$n = n_0/(1 + n_0/N)$, where $n_0 = (t_{\alpha/2} S/d)^2$ serves as a first
approximation to the sample size $n$. As the formula for $n$ depends on
$S^2$, an advance estimate of $S^2$ from a past survey on the same
population, from a similar survey or from a pilot survey may be used to
determine the sample size $n$ for the current survey.

b) Fixed relative standard error. If the coefficient of variation
$CV = \sigma/\bar{Y}$ is known, the sample size $n$ may be determined for
any prefixed relative standard error (RSE), that is, coefficient of
variation of the estimator. For SRSWOR,

$$CV(\bar{y}) = \sqrt{\frac{N - n}{N - 1}}\;\frac{CV}{\sqrt{n}}.$$

Then the sample size required to ensure an RSE of $E$ (expressed in the
same units as $CV$) for the estimator of $\bar{Y}$ is obtained by solving
the equation

$$\sqrt{\frac{N - n}{N - 1}}\;\frac{CV}{\sqrt{n}} = E$$

for $n$. This gives

$$n = \frac{N\,(CV)^2}{(N - 1)E^2 + (CV)^2}.$$

c) Fixed cost. If the sample size $n$ is subject to budgetary constraints, then
assuming a linear cost function $C = C_0 + nC_1$, where $C$ = total survey
cost, $C_0$ = overhead cost and $C_1$ = cost of surveying one unit, we
have $n = (C - C_0)/C_1$.
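The three rules for fixing $n$ translate directly into short functions. The sketch below is illustrative: the function names and default multiplier are assumptions, the inputs in the usage lines are hypothetical, and each function returns the unrounded solution, which in practice would be rounded up to the next integer (or down, in the fixed-cost case).

```python
import math


def n_for_margin(S, d, N, multiplier=1.96):
    """Case (a): n = n0 / (1 + n0/N) with n0 = (multiplier * S / d)^2."""
    n0 = (multiplier * S / d) ** 2
    return n0 / (1 + n0 / N)


def n_for_rse(cv, E, N):
    """Case (b): n = N * CV^2 / ((N - 1) * E^2 + CV^2)."""
    return N * cv ** 2 / ((N - 1) * E ** 2 + cv ** 2)


def n_for_cost(C, C0, C1):
    """Case (c): n = (C - C0) / C1 under the linear cost function."""
    return (C - C0) / C1


# Hypothetical inputs: advance estimate S = 10, margin d = 2, N = 500.
print(math.ceil(n_for_margin(10, 2, 500)))
# CV = 0.5 and target RSE E = 0.05 (both as proportions), N = 1000.
print(math.ceil(n_for_rse(0.5, 0.05, 1000)))
# Budget 10000, overhead 2000, cost 40 per unit.
print(math.floor(n_for_cost(10000, 2000, 40)))
```

Substituting the unrounded $n$ back into the expression for the margin of error or the RSE reproduces the target exactly, which is a convenient check on the algebra.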


3.12 Determination of Sample Size n in Sampling for Proportion 

a) Specified margin of error. Given (i) a chosen margin of error $d$ between
the sample proportion $p$ and the population proportion $P$, and (ii) a
small risk $\alpha$ of the actual error being larger than $d$.

Under SRSWOR, $\sigma_p = \sqrt{(N - n)/(N - 1)}\,\sqrt{PQ/n}$. Then

$$P(|p - P| > d) = \alpha,$$

which implies

$$d = t_{\alpha/2}\sqrt{(N - n)/(N - 1)}\,\sqrt{PQ/n},$$

where $t_{\alpha/2}$ is the point of the standard normal distribution which
cuts off an area $\alpha/2$ at each tail (assuming $(p - P)/\sqrt{V(p)}$ to
be asymptotically standard normal). Solving for $n$,
$n = n_0/[1 + (n_0 - 1)/N]$, where $n_0 = t_{\alpha/2}^2\,PQ/d^2$. As $P$
is not known, an advance estimate $p^*$ of $P$ from past experience or
from a pilot study may be substituted in the formula for determining $n$.

As a first approximation for large $N$, $n = n_0 = t_{\alpha/2}^2\,PQ/d^2$.

b) Fixed coefficient of variation. If it is desired to estimate $P$ with
coefficient of variation $C$ or less, we should have
$V(p)/P^2 \le C^2$, which implies

$$n \ge \frac{1 - P}{(1 - P)/N + C^2 P (N - 1)/N} \approx \frac{1 - P}{C^2 P}$$

for large $N$.
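The sample size rules for a proportion can be sketched in the same way. The function names, the default multiplier and the numerical inputs below are illustrative assumptions; $P$ stands for whatever advance estimate is available.

```python
import math


def n_for_proportion_margin(P, d, N, multiplier=1.96):
    """Case (a): n = n0 / (1 + (n0 - 1)/N) with n0 = multiplier^2 * P*Q / d^2."""
    n0 = multiplier ** 2 * P * (1 - P) / d ** 2
    return n0 / (1 + (n0 - 1) / N)


def n_for_proportion_cv(P, C, N):
    """Case (b): smallest n with V(p)/P^2 <= C^2 under SRSWOR (unrounded)."""
    Q = 1 - P
    return Q / (Q / N + C ** 2 * P * (N - 1) / N)


# Hypothetical advance estimate P = 0.3, margin d = 0.05, N = 2000.
print(math.ceil(n_for_proportion_margin(0.3, 0.05, 2000)))
# Same P, target coefficient of variation C = 0.10.
print(math.ceil(n_for_proportion_cv(0.3, 0.10, 2000)))
```

Since $PQ$ is largest at $P = 1/2$, taking $P = 0.5$ gives a conservative sample size when no advance estimate is available.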
Numerical Examples

Example 3.1: Draw a simple random sample without replacement of 10
households from a list containing 325 households in a village.

Solution: Serially arrange the households from 1 to 325. $N = 325$ is a
three-digit number. All possible three-digit random numbers are 000, 001, ...,
999. Start selecting three-digit random numbers from any point in the given
random number table (say from the beginning), rejecting numbers greater than
325, the number 000 and also repetitions.

Table of Random Numbers

036571 1308184542636081
5708257261 133895953242
712455 1864294928069035
5644341254996917055

The random numbers so selected correspond to the serial numbers of the
households. Thus, the serial numbers of the selected households are

036 130 263 157 082

124 294 069 035 125

This procedure rejects a large number of random numbers in the process of
selection. An alternative procedure commonly adopted is to divide the highest
three-digit number 999 by 325 and subtract the remainder 24 from 999, which
gives 975. Now consider three-digit random numbers from 001 to 975 and reject
the random numbers 976-999 and 000. If a three-digit random number in the
permissible range 001 to 975 does not exceed 325, the household having the
serial number equal to that random number is selected. If it is greater than
325, divide the random number by 325 and take the remainder as the serial
number of the household selected (remainder zero corresponds to serial number
325).

In this revised procedure repetitions are also rejected. Thus, the serial
numbers of the selected households are

036 246 130 168 129

263 283 157 082 247
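Both selection procedures of Example 3.1 can be traced mechanically. The sketch below reads the digits of the random number table above from left to right in groups of three and applies the rejection rule and the remainder rule in turn. The function names are illustrative, and reading the table row by row as one continuous string of digits is an assumption about how the table is meant to be used.

```python
# Digits of the random number table, rows joined in reading order.
TABLE_DIGITS = (
    "0365711308184542636081"
    "5708257261133895953242"
    "7124551864294928069035"
    "5644341254996917055"
)


def three_digit_numbers(digits):
    """Split the digit string into consecutive three-digit numbers."""
    usable = len(digits) - len(digits) % 3
    return [int(digits[i:i + 3]) for i in range(0, usable, 3)]


def select_by_rejection(numbers, N, n):
    """Keep numbers in 1..N, rejecting 000, larger numbers and repetitions."""
    chosen = []
    for r in numbers:
        if 1 <= r <= N and r not in chosen:
            chosen.append(r)
        if len(chosen) == n:
            break
    return chosen


def select_by_remainder(numbers, N, n, digits=3):
    """Remainder method: use 1..limit, where limit is the largest multiple of N."""
    limit = (10 ** digits - 1) // N * N        # 975 when N = 325
    chosen = []
    for r in numbers:
        if not 1 <= r <= limit:
            continue                            # rejects 000 and 976-999
        serial = r % N or N                     # remainder 0 means unit N
        if serial not in chosen:
            chosen.append(serial)
        if len(chosen) == n:
            break
    return chosen


numbers = three_digit_numbers(TABLE_DIGITS)
print(select_by_rejection(numbers, 325, 10))
print(select_by_remainder(numbers, 325, 10))
```

The remainder method reaches the required 10 households after reading only 10 random numbers, whereas the rejection method has to read 25 of them.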


Example 3.2: A finite population $(U_1, U_2, U_3, U_4)$ of size $N = 4$ has
values 2, 4, 6 and 12. For an SRSWOR of size $n = 2$, verify that
$E(\bar{y}) = \bar{Y}$, $E(s^2) = S^2$ and $V(\bar{y}) = [(N - n)/N]\,S^2/n$.

Solution:

Sl. No. of          Sample        Sample        Sample        s^2
possible samples    elements      values (y)    mean

1                   U1, U2        2, 4          3             2
2                   U1, U3        2, 6          4             8
3                   U1, U4        2, 12         7             50
4                   U2, U3        4, 6          5             2
5                   U2, U4        4, 12         8             32
6                   U3, U4        6, 12         9             18

$E(\bar{y}) = \sum \bar{y}/6 = (3 + 4 + 7 + 5 + 8 + 9)/6 = 36/6 = 6.$

$\bar{Y} = (\sum Y_i)/N = (2 + 4 + 6 + 12)/4 = 6.$

Thus, $E(\bar{y}) = \bar{Y}$.

$E(s^2) = \sum s^2/6 = (2 + 8 + 50 + 2 + 32 + 18)/6 = 56/3.$

$S^2 = (\sum Y_i^2 - N\bar{Y}^2)/(N - 1) = (200 - 144)/3 = 56/3.$

Thus, $E(s^2) = S^2$.

Further,

Sample                  1     2     3     4     5     6
Sample mean             3     4     7     5     8     9
Mean - 6               -3    -2     1    -1     2     3
(Mean - 6)^2            9     4     1     1     4     9

$V(\bar{y}) = E(\bar{y} - \bar{Y})^2 = (9 + 4 + 1 + 1 + 4 + 9)/6 = 14/3$

and

$[(N - n)/N]\,S^2/n = [(4 - 2)/4]\,[56/(2 \times 3)] = 14/3.$

Thus, $V(\bar{y}) = [(N - n)/N]\,S^2/n$.
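The verification in Example 3.2 can be repeated by machine for any small population. The sketch below enumerates all SRSWOR samples with exact fractions; the helper names are illustrative and are not part of the original example.

```python
from fractions import Fraction
from itertools import combinations

values = [2, 4, 6, 12]          # population of Example 3.2
N, n = len(values), 2


def mean(xs):
    """Exact arithmetic mean."""
    return Fraction(sum(xs), len(xs))


def variance_divisor_minus_one(xs):
    """Sum of squared deviations divided by (number of values - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


samples = list(combinations(values, n))          # the 6 possible samples
sample_means = [mean(s) for s in samples]

expected_mean = mean(sample_means)                               # E(ybar)
expected_s2 = mean([variance_divisor_minus_one(s) for s in samples])  # E(s^2)
var_of_mean = mean([(m - expected_mean) ** 2 for m in sample_means])  # V(ybar)
S2 = variance_divisor_minus_one(values)                          # S^2

print(expected_mean, mean(values))
print(expected_s2, S2)
print(var_of_mean, Fraction(N - n, N) * S2 / n)
```

Each printed pair should show two equal numbers, corresponding to the three identities verified in the example.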

Example 3.3: For the population considered in Example 3.2 and with simple 
random sample (with replacement) of size a? = 2, 

i) Verily that the sample mean y is an unbiased estimate of Y. 

ii) Compute V(y) with the help of all possible samples. 

iii) Compute the estimate of relative efficiency of SRSWOR if the sample 
values areyi = 4 and >>2 = 6. 

Solution:

Sl. No. of          Sample        Sample        Sample             $(\bar{y} - \bar{Y})^2$
possible samples    elements      values (y)    mean ($\bar{y}$)

1                   U_1, U_1      2, 2          2                  16
2                   U_1, U_2      2, 4          3                  9
3                   U_1, U_3      2, 6          4                  4
4                   U_1, U_4      2, 12         7                  1
5                   U_2, U_1      4, 2          3                  9
6                   U_2, U_2      4, 4          4                  4
7                   U_2, U_3      4, 6          5                  1
8                   U_2, U_4      4, 12         8                  4
9                   U_3, U_1      6, 2          4                  4
10                  U_3, U_2      6, 4          5                  1
11                  U_3, U_3      6, 6          6                  0
12                  U_3, U_4      6, 12         9                  9
13                  U_4, U_1      12, 2         7                  1
14                  U_4, U_2      12, 4         8                  4
15                  U_4, U_3      12, 6         9                  9
16                  U_4, U_4      12, 12        12                 36


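$$E(\bar{y}) = \sum \bar{y}/16 = 96/16 = 6 = \bar{Y}.$$

$$V(\bar{y})_{SRSWR} = \sum (\bar{y} - \bar{Y})^2/16 = 112/16 = 7.$$

For the sample of size 2 with sample values 4 and 6,

$$s^2 = [1/(n - 1)]\Big(\sum y_i^2 - n\bar{y}^2\Big) = 2.$$

$$\text{Est. } V(\bar{y})_{SRSWOR} = [(N - n)/N]\,(s^2/n) = (2/4)(2/2) = 1/2.$$

$$\text{Est. } V(\bar{y})_{SRSWR} = s^2/n = 2/2 = 1.$$

$$\text{RE} = \text{Est. } V(\bar{y})_{SRSWR} / \text{Est. } V(\bar{y})_{SRSWOR} = 1/(1/2) = 2.$$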

Example 3.4: In a sample survey to estimate the population mean $\bar{Y}$, a simple
random sample without replacement of 25 units is drawn from a finite
population consisting of 150 units and $y_1, y_2, \ldots, y_{25}$ are observed, with

$$\sum y^2 = 23428, \qquad \sum y = 650.$$

Compute an unbiased estimate of $\bar{Y}$ and estimate its standard error and
95% confidence limits for $\bar{Y}$.

Solution: An unbiased estimate of $\bar{Y}$ is the sample mean $\bar{y} = 650/25 = 26$.
Further,

$$s^2 = \Big(\sum y^2 - n\bar{y}^2\Big)/(n - 1) = (1/24)(23428 - 25 \times 26^2) = 272.$$

$$\text{Est. S.E.}(\bar{y}) = \sqrt{(N - n)/N}\,\sqrt{s^2/n} = \sqrt{(150 - 25)/150}\,\sqrt{272/25} = 3 \text{ approximately.}$$

The confidence limits for $\bar{Y}$ are $\bar{y} \pm t_{.05}\,\text{Est. S.E.}(\bar{y})$, where
Est. S.E.$(\bar{y}) = 3$ and $t_{.05}$, the 5% (two-tailed) percentage point of the t
distribution with n - 1 (= 24) degrees of freedom, is taken as 1.96 (the
large-sample value; the tabulated t value for 24 degrees of freedom is about 2.06).
The calculated confidence limits are 26 ± (1.96)(3).
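The arithmetic of Example 3.4 can be reproduced with a short script. This is a minimal sketch that works from the two sample totals given in the example; the variable names are illustrative, and the multiplier 1.96 is simply the value used in the example.

```python
import math

N, n = 150, 25
sum_y, sum_y2 = 650, 23428          # sample totals given in Example 3.4

y_bar = sum_y / n                                     # unbiased estimate of the mean
s2 = (sum_y2 - n * y_bar ** 2) / (n - 1)              # sample mean square
se = math.sqrt((N - n) / N) * math.sqrt(s2 / n)       # estimated S.E. with fpc

z = 1.96                                              # multiplier used in the example
lower, upper = y_bar - z * se, y_bar + z * se

print(y_bar, s2, round(se, 3))
```

The unrounded standard error is slightly above 3, so limits computed from it are marginally wider than 26 ± (1.96)(3).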

Example 3.5: For a population $(U_1, U_2, U_3, U_4, U_5, U_6)$ of size 6 the units are
characterized by either A or B as follows: B A B A B B.

The proportion of A in the population is P = 2/6 = 1/3. For SRSWOR of size n = 2,
calculate $V(p)$ and verify that $(1 - n/N)\,pq/(n - 1)$ is an unbiased estimate of $V(p)$.

Solution: N = 6, n = 2.

$$V(p) = [(N - n)/(N - 1)]\,PQ/n = (4/5)(1/3)(2/3)(1/2) = 4/45.$$

Sample    (U_1, U_2)   (U_1, U_3)   (U_1, U_4)   (U_1, U_5)   (U_1, U_6)
p         1/2          0            1/2          0            0
pq        1/4          0            1/4          0            0

Sample    (U_2, U_3)   (U_2, U_4)   (U_2, U_5)   (U_2, U_6)   (U_3, U_4)
p         1/2          1            1/2          1/2          1/2
pq        1/4          0            1/4          1/4          1/4

Sample    (U_3, U_5)   (U_3, U_6)   (U_4, U_5)   (U_4, U_6)   (U_5, U_6)
p         0            0            1/2          1/2          0
pq        0            0            1/4          1/4          0

Eight of the fifteen equally likely samples have pq = 1/4 and the rest have pq = 0, so

$$E[(1 - n/N)\,pq/(n - 1)] = (2/3)(8 \times 1/4)/15 = 4/45 = V(p).$$
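The same check can be carried out by enumerating all 15 samples of size 2. The following is a small sketch using exact fractions; the string `"BABABB"` encodes the A/B labels of the six units, and the helper name `prop` is an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations

labels = "BABABB"                       # attributes of U_1, ..., U_6
N, n = len(labels), 2

P = Fraction(labels.count("A"), N)
V_p = Fraction(N - n, N - 1) * P * (1 - P) / n        # formula for V(p)


def prop(sample):
    """Sample proportion of A."""
    return Fraction(sample.count("A"), n)


samples = list(combinations(labels, n))               # 15 equally likely samples

# Variance of p obtained directly from the sampling distribution.
V_enum = sum((prop(s) - P) ** 2 for s in samples) / len(samples)

# Expected value of the variance estimator (1 - n/N) p q / (n - 1).
E_est = sum((1 - Fraction(n, N)) * prop(s) * (1 - prop(s)) / (n - 1)
            for s in samples) / len(samples)

print(V_p, V_enum, E_est)
```

All three quantities agree, which confirms both the variance formula and the unbiasedness of the estimator for this small population.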


Example 3.6: In a simple random sample (without replacement) of size 200 
from a population of size 1000, 75 persons happened to be smokers. Estimate 
the proportion of smokers and total number of smokers in the population and 
calculate 95% confidence interval for the proportion of smokers and also for the 
total number of smokers in the population. 

Solution: N = 1000, n = 200. The estimate of the proportion of smokers is
p = 75/200 = 3/8 = 0.375.

The estimate of the total number of smokers is

$$\hat{N}_1 = Np = 1000(0.375) = 375.$$

$$\text{Est. S.E.}(p) = \sqrt{\text{Est. } V(p)} = \sqrt{1 - n/N}\,\sqrt{pq/(n - 1)} = 0.0307.$$

The confidence limits of P are

$$p \pm t_{.05}\,\text{Est. S.E.}(p) = 0.375 \pm (1.96)(0.0307).$$

Thus, the 95% confidence interval of P is 0.3148 < P < 0.4352 and that of $N_1 = NP$
is approximately 315 < $N_1$ < 435.
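The calculation in Example 3.6 can be wrapped in a small helper. This is a sketch only: the function name `proportion_ci` and its argument names are assumptions made for illustration, and the default multiplier 1.96 is the one used in the example.

```python
import math


def proportion_ci(a, n, N, z=1.96):
    """Estimate a proportion from an SRSWOR sample with a successes out of n.

    Returns (p, estimated S.E., (lower, upper)) using the finite
    population correction and divisor (n - 1).
    """
    p = a / n
    se = math.sqrt(1 - n / N) * math.sqrt(p * (1 - p) / (n - 1))
    return p, se, (p - z * se, p + z * se)


# Example 3.6: 75 smokers in a sample of 200 from 1000 persons.
p, se, (lower, upper) = proportion_ci(75, 200, 1000)
total_limits = (1000 * lower, 1000 * upper)   # limits for the number of smokers

print(p, round(se, 4), round(lower, 4), round(upper, 4))
```

Multiplying the limits for the proportion by N gives the limits for the total, since the total estimate is N times the proportion estimate.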

Example 3.7: In an agricultural survey a random sample of 200 households out
of 600 households in a village gave the following data:

Holding size Number of households 

(in acres) 

0-2.50 78 

2.50-5.00 89 

5.00-10.00 30

>10 3

Make a 95% confidence statement on the fraction of all households
having holding size greater than 2.5 acres.

Solution: The sample proportion of households having holding sizes
greater than 2.5 acres is p = 122/200 = 0.61.

$$\text{Est. S.E.}(p) = \sqrt{1 - n/N}\,\sqrt{pq/(n - 1)} = \sqrt{1 - (200/600)}\,\sqrt{(0.61)(0.39)/199} = 0.0282.$$

A 95% confidence statement is given by

$$\text{Prob}\,[\,p - 1.96\,\text{Est. S.E.}(p) < P < p + 1.96\,\text{Est. S.E.}(p)\,] = 0.95.$$

That is,

$$\text{Prob}\,(0.5547 < P < 0.6653) = 0.95.$$

Example 3.8: Suppose the coefficient of variation of income of households in a
region is 75%. How large a random sample of households is required for a
margin of error of ± 5% with 95% confidence for a

a) region with 2000 households?

b) region with 5000 households?

Solution: $(\text{C.V.})^2 = (\sigma/\bar{Y})^2 = (0.75)^2$.
Therefore,

$$\sigma^2 = (0.75)^2\,\bar{Y}^2, \qquad d = \pm(0.05)\,\bar{Y}.$$

Calculate

$$n_0 = t^2\sigma^2/d^2 = [(1.96)^2 (0.75)^2\,\bar{Y}^2]/[(0.05)^2\,\bar{Y}^2] = 864.36, \text{ rounded up to } 865.$$

a) $n = n_0/(1 + n_0/N) = 865/(1 + 865/2000) = 604$

b) $n = n_0/(1 + n_0/N) = 865/(1 + 865/5000) = 738$.
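The two-step calculation (first the infinite-population size, then the finite population adjustment) is easy to script. The sketch below rounds up at each step, which reproduces the figures of Example 3.8; the function name `srs_sample_size` and its parameter names are hypothetical.

```python
import math


def srs_sample_size(cv, rel_error, N, z=1.96):
    """Sample size for estimating a mean with a given relative margin of error.

    cv        : coefficient of variation of the study variable
    rel_error : margin of error as a fraction of the mean
    N         : population size
    Returns (n0, n): the first approximation and the size adjusted for N.
    """
    n0 = math.ceil((z * cv / rel_error) ** 2)
    n = math.ceil(n0 / (1 + n0 / N))
    return n0, n


print(srs_sample_size(0.75, 0.05, 2000))   # (865, 604)
print(srs_sample_size(0.75, 0.05, 5000))   # (865, 738)
```

The larger region needs a larger sample, but far less than proportionally: the required size is driven mainly by the variability and the precision demanded.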


Example 3.9: For a finite population of size N = 500, the population proportion
is known to be 0.85. How large a sample should be taken in order to estimate P
with margin of error 0.05 and confidence coefficient 0.95?

Solution: P = 0.85, Q = 1 - 0.85 = 0.15, d = 0.05, $t_{0.025} = 1.96$.

$$n_0 = \frac{t_{\alpha/2}^2\,PQ}{d^2} = \frac{(1.96)^2 (0.85)(0.15)}{0.0025} \approx 196,$$

$$n = 196/[1 + (196 - 1)/500] \approx 141.$$


EXERCISES 

3.1 A simple random sample of 25 households was selected from a
Community Development Block of a district comprising 1597
households. The sample households were of the following sizes.

2, 5, 4, 2, 6, 3, 2, 4, 5, 7, 4, 2, 3, 5, 3, 3, 2, 4, 1, 3, 2, 1, 3, 4, 5. 

Estimate the total number of people in the Block and compute the estimate 
of the standard error and 95% confidence interval for true total. 

3.2 In a poverty survey conducted in an area comprising 300 villages a
simple random sample of 18 villages was taken and the number of
households below poverty line was counted for each sampled village as
given below.

Number of households below poverty line:
0 1 2 3 4 5 6 7 8
8 0 4 3 2 0 2 1 5

Estimate the total number of households below poverty line in the villages
and compute the estimate of standard error.

3.3 An interviewer while interviewing an SRSWOR of 12 persons selected out
of 100 persons observes not-at-homes among the respondents as follows
(denoting present-at-home and not-at-home as 1 and 0 respectively).

Sl. No. of respondents   1  2  3  4  5  6  7  8  9  10  11  12
Not at homes             1  1  1  0  0  0  1  1  1  1   0   1

Estimate the proportion of not-at-homes, give its standard error and
compute a 99% confidence interval for the population proportion of
not-at-homes.

3.4 Prove that a simple random sub-sample of $n_1$ units from a simple random
sample of n (> $n_1$) units is also a simple random sample of $n_1$ units from
the original population.

3.5 Let $n_1$ be the size of a simple random sample drawn from a population
consisting of N units. Further let $n_2$ be the size of another simple random
sample drawn from the sub-population consisting of $N - n_1$ units not
covered by the first sample. Show that the combined sample of size $n_1 + n_2$
is a simple random sample from the original population.

3.6 A population consists of N units, out of which the values of k units $Y_1, Y_2, \ldots, Y_k$
are known. An SRSWOR of size n is drawn from the rest of the N - k units.
Show that $\sum_{i=1}^{k} Y_i + (N - k)\,\bar{y}_n$ has smaller variance than $N\bar{y}_n$ based on an
SRSWOR of size n selected from the entire population.

3.7 From a simple random sample of n units drawn from a population
consisting of N units, a sub-sample of m units is drawn and added to the
original sample. Show that the mean based on n + m units is an unbiased
estimate of the population mean. Compare the efficiency of this estimator
with the one based on the original n units.

3.8 A population consists of k large units. Select an SRS of (n - k) units out of the
N - k units remaining after eliminating the k large units, and with this sample
include the k large units outright to make it a sample of size n.

Suggest an unbiased estimator of the population mean, derive its variance
estimator and comment.

3.9 From a population consisting of N units, draw an SRS of $n_1$ units. Next,
draw an SRS of $n_2$ units from the remaining $N - n_1$ units. Let $\bar{y}_{n_1}$ and
$\bar{y}_{n_2}$ be sample means based on the two samples. Suggest an unbiased
estimator of the population mean based on both the sample means and find its
variance.

3.10 In an SRSWR of size n, for what value of $\lambda$ does $\hat{\mu} = \lambda\bar{y}$ have minimum mean
square error? Hence, show that if advance knowledge on the coefficient of
variation $\sigma(y)/\bar{Y}$ is available,

$$\min \text{MSE}(\hat{\mu}) < V(\bar{y}).$$


3.11 Let P denote the proportion of a rare attribute in a population of size N.
Units in the sample are drawn without replacement one after another until
m units having the rare attribute are observed.

Show that the random variable n has the probability distribution

$$P(n \mid m, NP, N - NP) = \frac{\binom{NP}{m-1}\binom{NQ}{n-m}}{\binom{N}{n-1}} \cdot \frac{NP - (m - 1)}{N - (n - 1)},$$

where Q = 1 - P and n = m, m + 1, .... Show that p = (m - 1)/(n - 1) is an
unbiased estimate of P and an unbiased estimator of $V(p)$ is

$$v(p) = \frac{p(1 - p)}{n - 2}\left(1 - \frac{n - 1}{N}\right).$$

What is the limiting distribution of n for large N?

3.12 Show that in SRSWOR (N, n)

i) $E(\bar{y} - \bar{Y})^3 = \dfrac{(N - n)(N - 2n)}{(N - 1)(N - 2)\,n^2}\,\mu_3$,

ii) $E(m_3) = \dfrac{N^2 (n - 1)(n - 2)}{n^2 (N - 1)(N - 2)}\,\mu_3$,

where $m_3$ and $\mu_3$ are the third sample and population moments respectively.

3.13 In a survey to study drinking water facilities in a rural area covering 315
households in a village it was observed that 272 of them had their own
wells and the rest had shared wells. Determine
the sample size to estimate the proportion of households having their own
wells with 10% margin of error and 99% confidence coefficient.

3.14 In an agricultural survey complete enumeration of 435 households of a
village had average holding size ($\bar{Y}$) = 3.42 acres and standard deviation of
holding size ($\sigma$) = 1.15 acres.

With simple random sampling how many households must be selected to
estimate $\bar{Y}$ within 5% apart from a chance of 1 in 20?

3.15 Suppose 1, 2, ..., k are k independent simple random samples of sizes $n_1$,
$n_2, \ldots, n_k$ from the same finite population of size N.

i) Show that in case of both SRSWR and SRSWOR

$$\bar{y} = (1/n)\sum_{i=1}^{k} n_i\,\bar{y}_i, \qquad n = \sum_{i=1}^{k} n_i,$$

is an unbiased estimator of the population mean $\bar{Y}$.

ii) Find the variance of $\bar{y}$ and its unbiased estimator.

iii) Prove that $\bar{y}' = k^{-1}\sum_{i=1}^{k} \bar{y}_i$ is less efficient than $\bar{y}$.

3.16 In SRSWOR, prove that

$$\text{MSE}(s_y) = E(s_y - S_y)^2 = 2S_y\,(S_y - E(s_y)).$$

Show that, approximately,

$$E(s_y) = S_y\left[1 - \frac{V(s_y^2)}{8 S_y^4}\right] \quad \text{and} \quad V(s_y) = \frac{V(s_y^2)}{4 S_y^2}.$$

3.17 Derive the bias and mean square error of the following estimators of the
population mean in SRSWR

i) $t_1 = n\bar{y}/(n + s^2/\bar{y}^2)$.

ii) $t_2 = \bar{y} + k\sigma^2/(n\bar{y})$ (Upadhyaya and Srivastava, 1976), where k is a scalar
and $\sigma^2$ is known.

iii) $t_3 = \dfrac{\bar{y}}{1 + k\sigma^2/(n\bar{y}^2 + g\sigma^2)}$ (Srivastava and Bhatnagar, 1983), where
k and g are known constants and $\sigma^2$ is known. Comment on their
properties.

3.18 In a college having 3000 students 45% are females. Estimate the sample
size needed to estimate the proportion of female students so that the margin of
error is 5% with 95% confidence coefficient when the sampling is carried
out (i) with replacement and (ii) without replacement.

3.19 Let $(Y_i, X_i)$ be paired observations corresponding to variates y and x,
defined on every unit in the population of size N. For an SRSWOR of size
n define $\bar{y}$ and $\bar{x}$ as the sample means.

Show that

$$E(\bar{y} - \bar{Y})(\bar{x} - \bar{X}) = (n^{-1} - N^{-1})\,\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})(X_i - \bar{X})}{N - 1},$$

where $\bar{Y}$ and $\bar{X}$ are population means of y and x respectively.

3.20 Let $f(t, t')$ and $g(t, t')$ be two functions of random variables t and t'. Show
that

i) $\text{cov}(f, g) = E[\text{cov}(f, g \mid t')] + \text{cov}[E(f \mid t'), E(g \mid t')]$,

ii) $V(f) = E[V(f \mid t')] + V[E(f \mid t')]$.

3.21 Suppose a population is known to contain an unusually low value $y_1$ and an
unusually high value $y_N$. For SRSWOR of size n, consider the following
estimator

$$\bar{y}_C = \begin{cases} \bar{y} + C, & \text{if the sample contains } y_1 \text{ but not } y_N, \\ \bar{y} - C, & \text{if the sample contains } y_N \text{ but not } y_1, \\ \bar{y}, & \text{for all other samples,} \end{cases}$$

where C is a constant. Show that $\bar{y}_C$ is unbiased with variance

$$V(\bar{y}_C) = \left(1 - \frac{n}{N}\right)\left[\frac{S^2}{n} - \frac{2C}{N - 1}\,(y_N - y_1 - nC)\right]$$

(Särndal, 1972).

3.22 Suggest an estimate of variance when the population mean is known and
mention its properties.

3.23 Prove that in simple random sampling with unlabelled observations $y_1, y_2,
\ldots, y_n$, the sample mean $\bar{y}$ is the best linear unbiased estimate of the
population mean $\bar{Y}$.

3.24 Let a population consist of three units $u_1$, $u_2$ and $u_3$ with variate values $x_1$,
$x_2$ and $x_3$ respectively. Possible probability samples are $s_1 = (u_1, u_2)$, $s_2 =
(u_1, u_3)$, $s_3 = (u_2, u_3)$, with $P(s_i) = 1/3$ (i = 1, 2, 3). Define the estimator $t_1$ as $t_1(s_1)
= \tfrac{1}{2}x_1 + \tfrac{1}{2}x_2$, $t_1(s_2) = \tfrac{1}{2}x_1 + \tfrac{2}{3}x_3$ and $t_1(s_3) = \tfrac{1}{2}x_2 + \tfrac{1}{3}x_3$. Define another
alternative estimator $t_2$ as $t_2(s_1) = \tfrac{1}{2}(x_1 + x_2)$, $t_2(s_2) = \tfrac{1}{2}(x_1 + x_3)$ and $t_2(s_3)
= \tfrac{1}{2}(x_2 + x_3)$. Compare $t_1$ and $t_2$ as estimators of the population mean and
comment.

3.25 Two lists of ration cards for the same population consist of N and M names
with D names common to them. SRSWR of size n and m are selected from
the two lists respectively and d names are observed to be common to the two
samples. Show that $\hat{D} = (NM/nm)\,d$ is an unbiased estimate of D. Find
$V(\hat{D})$ and its unbiased estimate.



4. UNEQUAL PROBABILITY 
SAMPLING 


4.1 Introduction 

Equal probability sampling presupposes equal importance attached to units in 
the population while selecting them into the sample. No special characteristics 
of the unit except the value of the study variate is attached to the sampling unit. 
In certain circumstances this seems awkward, for example, a large unit (unit 
with large value for the measure of size) appears to contribute more to the 
population total or mean in question. Therefore, it is natural to expect that some 
method of probability sampling which takes into account varying sizes of the 
sampling units could be a more justified sampling technique. Thus, a scheme of 
sampling with varying probabilities of selection or unequal probabilities of 
selection has come to take a place in sampling theory literature using auxiliary 
information at the selection stage and is popularly termed as sampling with 
probability proportional to size (PPS). The word ‘size’ here is used in a broader 
sense and relates to any auxiliary character supposed to be correlated with the 
main character under study. For example, information on some measurable 
socio-economic character collected during previous survey related with the 
main character under study in the current period may be used as a size measure. 
Similarly in a household survey where the villages are the sampling units, the 
number of households in a village may be taken as a measure of size. Further, in 
estimating crop characteristics in a crop survey, the cultivated area for a 
previous period or size of the plot is usually preferred as a good measure of size. 
In an industrial survey of establishments, the number of workers or the size of
the fixed capital may be taken as the size of the industrial establishment. The
PPS selection procedure may make larger units over represented than the 
smaller ones, thus giving rise to inflated estimates. But this possibility can be 
corrected by properly weighting the units with probabilities of selection at the 
estimation stage. It may be pointed out here that the relative importance of the 
units of a population expressed in terms of scores based on some qualitative 
characteristics may also be taken as size measure for the purpose of PPS 
sampling. The technique of PPS sampling was first put forward by 
Mahalanobis (1938) while sampling plots for a crop survey and its details were 
worked out by Hansen and Hurwitz (1943). However, the theory of unequal
probability may be qualified under other considerations and also may be
considered in a more general framework.

The units in unequal probability sampling (UPS) or PPS sampling may be
selected with or without replacement. In sampling with replacement the
probability of selecting a specified unit is the same at every draw and is equal to its initial
probability of selection. But, in sampling without replacement the probability
of selecting a particular unit changes from draw to draw.


4.2 Sampling with Unequal Probabilities with Replacement 

4.2.1. Methods of selection of PPS (with replacement) sample

Let y be the character under study and x be the size measure. To draw a sample
of size n from a population of size N, denote by $X_i$ an integer proportional to the size
of the ith unit (i = 1, 2, ..., N). Form successive cumulative totals of the $X_i$'s as $X_1$,
$X_1 + X_2$, ..., $X_1 + X_2 + \ldots + X_N$ and draw a random number R not exceeding
$\sum_{i=1}^{N} X_i = X$ from a table of random numbers. If
$X_1 + X_2 + \ldots + X_{k-1} < R \le X_1 + X_2 + \ldots + X_k$, then the kth unit is selected. Repeat the procedure until n
units are selected including repetitions. Thus, the probability of selecting the
kth unit is

$$P(U_k) = \frac{\sum_{i=1}^{k} X_i - \sum_{i=1}^{k-1} X_i}{X} = \frac{X_k}{X}.$$

The procedure mentioned above becomes tedious and time consuming
when N is large. Lahiri (1951) has devised a procedure which avoids
cumulation of size measures. In this procedure a pair of random numbers (R, S),
where

$$1 \le R \le N \quad \text{and} \quad 1 \le S \le \max(X_1, X_2, \ldots, X_N) = M \text{ (say)}$$

or something greater, is selected. If the second random number S is less than or
equal to $X_R$, the Rth unit is selected; otherwise the pair is rejected and another
pair is chosen. The procedure continues till n units are selected.

The probability of selecting the kth unit in a draw is $p_k = (1/N)(X_k/M)$ and the
probability of selecting any unit at an effective draw is the sum of the
probabilities of selecting different units, given by

$$\sum_{k=1}^{N} p_k = \frac{1}{N}\sum_{k=1}^{N} \frac{X_k}{M} = \frac{\bar{X}}{M}, \qquad \bar{X} = X/N.$$

Thus the probability of rejecting a draw is $1 - (\bar{X}/M)$. Therefore, the probability of
selecting the kth unit is

$$p_k + (1 - \bar{X}/M)\,p_k + (1 - \bar{X}/M)^2\,p_k + \ldots = \frac{p_k}{\bar{X}/M} = \frac{X_k}{N\bar{X}} = \frac{X_k}{X},$$

since $|1 - \bar{X}/M| < 1$.

The main advantage of Lahiri’s method is that it does not require writing
down the cumulative totals of sizes and, further, the sizes of all the units need not
be known to draw a sample. What is needed is only some number greater than
the maximum size. However, sometimes in this method a large number of draws
may be rejected, resulting in wastage of time and effort. This can be partially
avoided by the split method, in which the units having large sizes are split into
two or more split units and their sizes are distributed over the split units. After this
process, Lahiri’s method is applied to select one unit from the totality of all split
units and unsplit units, with the prefixed rule that whenever a split unit is chosen, the
original unit to which it belongs is considered as selected.

If the frame consists of a map showing the boundaries of units according 
to scale, the areas occupied by the units are proportional to their sizes and thus 
the pps selection can be made by following a graphical procedure by selecting a 
pair of random numbers regarded as co-ordinates to locate a random point on 
the map and then selecting the unit within which this point falls. 

Illustration: Consider a population of 10 factories. It is required to select a pps 
sample with size being number of workers (Table 4.1). 


Table 4.1 Selecting a pps sample from the population of factories

Factory   Number of   Cumulative   Random numbers   Selected
          workers     total        associated       number

1         38          38           1-38
2         571         609          39-609
3         425         1034         610-1034
4         613         1647         1035-1647        1045
5         845         2492         1648-2492
6         134         2626         2493-2626
7         218         2844         2627-2844
8         925         3769         2845-3769
9         519         4288         3770-4288
10        675         4963         4289-4963

Suppose a four digit random number between 0001 and 4963 is chosen,
say 1045. This random number is associated with the 4th factory and therefore the 4th
factory is selected by the PPSWR sampling method. The process continues till the
desired number of factories are selected.
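The cumulative total method illustrated above amounts to a search in the list of cumulative totals. The following is a minimal sketch using the worker counts of Table 4.1; the function names and the use of Python's pseudo-random generator in place of a random number table are assumptions made for illustration.

```python
import bisect
import random
from itertools import accumulate

workers = [38, 571, 425, 613, 845, 134, 218, 925, 519, 675]   # Table 4.1
cum = list(accumulate(workers))      # 38, 609, 1034, ..., 4963


def unit_for(R):
    """Return the 1-based unit whose range of random numbers contains R.

    Unit k is selected when cum[k-2] < R <= cum[k-1], with 1 <= R <= cum[-1].
    """
    return bisect.bisect_left(cum, R) + 1


def ppswr_sample(n, rng):
    """Draw n units with replacement, with probability proportional to size."""
    return [unit_for(rng.randint(1, cum[-1])) for _ in range(n)]


print(unit_for(1045))                      # 4, as in the illustration
print(ppswr_sample(3, random.Random(0)))   # three draws, repetitions allowed
```

A binary search keeps each draw cheap even when N is large, although the cumulative totals still have to be formed once.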

Lahiri’s Method 

As N = 10, choose a two digit random number between 01 and 10, say 05. Here R = 5 and $X_5$ =
size of 5th unit = 845. Next choose a three digit random number not exceeding
925, since M = 925. Suppose the chosen random number is S = 742. As S < $X_5$, the
5th unit is selected. To select the next unit, choose another pair of random
numbers (R, S), say (7, 438). Now $X_7$ = 218 and S = 438. This pair of random
numbers is rejected since S > $X_7$. Then, we proceed to choose the next pair. The
process continues till the desired number of units are selected.
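Lahiri's accept-or-reject rule can be sketched as follows, again with the worker counts of Table 4.1. The function names are hypothetical, and a pseudo-random generator stands in for the random number table.

```python
import random

workers = [38, 571, 425, 613, 845, 134, 218, 925, 519, 675]   # Table 4.1
M = max(workers)                                               # 925


def accepts(R, S, sizes=workers):
    """A pair (R, S) selects unit R only if S does not exceed its size."""
    return S <= sizes[R - 1]


def lahiri_draw(rng, sizes=workers, bound=M):
    """Repeat the pair selection until a unit is accepted; return its number."""
    while True:
        R = rng.randint(1, len(sizes))     # 1 <= R <= N
        S = rng.randint(1, bound)          # 1 <= S <= M (or any larger bound)
        if accepts(R, S, sizes):
            return R


print(accepts(5, 742))   # True: 742 <= 845, unit 5 is selected
print(accepts(7, 438))   # False: 438 > 218, the pair is rejected
```

No cumulative totals are needed, at the cost of a random number of rejected pairs before each selection.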


4.2.2. Estimation of Population Total/Mean 

Denote y as the character under study, the variate values attached to the N units
as $Y_1, \ldots, Y_N$ and the probability of selecting the ith unit as $P_i$, with
$\sum_{i=1}^{N} P_i = 1$.

Suppose $(y_i, p_i)$ are the sample observation and the initial probability of selection of
the unit selected at the ith draw, i = 1, 2, ..., n. Now, $z_i = y_i/p_i$ is a random
variable taking the N values $Z_i = Y_i/P_i$, i = 1, 2, ..., N, with probabilities of selection
$P_i$ (i = 1, 2, ..., N). The $z_i$'s are independent.

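An unbiased estimator of the population total Y in unequal probability
sampling with replacement (UPSWR) or probability proportional to size
sampling with replacement (PPSWR) is given by (Hansen and Hurwitz, 1943)

$$\hat{Y}_{PPSWR} = n^{-1}\sum_{i=1}^{n} \frac{y_i}{p_i},$$

because

$$E(\hat{Y}_{PPSWR}) = n^{-1}\sum_{i=1}^{n} E\left(\frac{y_i}{p_i}\right) = n^{-1}\sum_{i=1}^{n}\sum_{j=1}^{N} \left(\frac{Y_j}{P_j}\right) P_j = Y.$$

The sampling variance of $\hat{Y}_{PPSWR}$ is

$$V(\hat{Y}_{PPSWR}) = V\left(n^{-1}\sum_{i=1}^{n} \frac{y_i}{p_i}\right) = n^{-2}\sum_{i=1}^{n} V\left(\frac{y_i}{p_i}\right),$$

since $\text{cov}\left(\dfrac{y_i}{p_i}, \dfrac{y_j}{p_j}\right) = 0$ for $i \ne j$, so that

$$V(\hat{Y}_{PPSWR}) = n^{-1}\sum_{i=1}^{N} P_i\left(\frac{Y_i}{P_i} - Y\right)^2 = n^{-1}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right).$$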

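Hence, we state the following theorem.

Theorem 4.1: In PPSWR sampling $n^{-1}\sum_{i=1}^{n} y_i/p_i$ is an unbiased estimate
of the population total Y and

$$V\left(n^{-1}\sum_{i=1}^{n} \frac{y_i}{p_i}\right) = n^{-1}\sum_{i=1}^{N} P_i\left(\frac{Y_i}{P_i} - Y\right)^2.$$

Corollary 1: $n^{-1}\sum_{i=1}^{n} \dfrac{y_i}{N p_i}$ is an unbiased estimate of the population mean $\bar{Y}$ with
sampling variance

$$V\left(n^{-1}\sum_{i=1}^{n} \frac{y_i}{N p_i}\right) = n^{-1}\sum_{i=1}^{N} P_i\left(\frac{Y_i}{N P_i} - \bar{Y}\right)^2.$$

Corollary 2: An alternative form for $V(\hat{Y}_{PPSWR})$ is

$$V(\hat{Y}_{PPSWR}) = \frac{1}{n}\sum\sum_{i<j} P_i P_j\left(\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right)^2.$$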

Note: It may be seen that $V(\hat{Y}_{PPSWR})$ is zero if $Y_i/P_i$ = constant. If $P_i = X_i/X$, the
pps estimate will have zero variance if the measures of size are such that the
variate y is proportional to x.

Thus, if a survey practitioner is satisfied that y/x is nearly constant, or in
other words if the regression line of y on the size measure x passes or nearly
passes through the origin and y and x are positively correlated, he may advise
the use of pps sampling in practice.

In terms of size measures x, we may write

$$V(\hat{Y}_{PPSWR}) = \frac{X}{n}\sum_{i=1}^{N} X_i\left(\frac{Y_i}{X_i} - R\right)^2, \qquad R = Y/X.$$

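Theorem 4.2: An unbiased estimator of $V(\hat{Y}_{PPSWR})$ is given by

$$\text{Est. } V(\hat{Y}_{PPSWR}) = \left(\sum_{i=1}^{n} \frac{y_i^2}{p_i^2} - n\hat{Y}^2_{PPSWR}\right)\Big/ n(n - 1)$$

$$= \sum_{i=1}^{n}\left(\frac{y_i}{p_i} - \hat{Y}_{PPSWR}\right)^2\Big/ n(n - 1)$$

$$= \frac{1}{n^2}\sum\sum_{i<j}\left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2\Big/ (n - 1).$$

Proof.

$$\text{Est. } V(\hat{Y}_{PPSWR}) = \text{Est.}\left[n^{-1}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right)\right] = n^{-1}\,\text{Est.}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P_i}\right) - n^{-1}\,\text{Est.}(Y^2)$$

$$= n^{-2}\sum_{i=1}^{n} \frac{y_i^2}{p_i^2} - n^{-1}\left(\hat{Y}^2_{PPSWR} - \text{Est. } V(\hat{Y}_{PPSWR})\right),$$

since $E\left(n^{-1}\sum_{i=1}^{n} y_i^2/p_i^2\right) = \sum_{i=1}^{N} Y_i^2/P_i$ and $E(\hat{Y}^2_{PPSWR}) = Y^2 + V(\hat{Y}_{PPSWR})$.

On simplification,

$$\text{Est. } V(\hat{Y}_{PPSWR}) = \left(\sum_{i=1}^{n} \frac{y_i^2}{p_i^2} - n\hat{Y}^2_{PPSWR}\right)\Big/ n(n - 1) = \sum_{i=1}^{n}\left(\frac{y_i}{p_i} - \hat{Y}_{PPSWR}\right)^2\Big/ n(n - 1)$$

$$= \frac{1}{n^2}\sum\sum_{i<j}\left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2\Big/ (n - 1).$$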
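The estimator of the total and the variance estimator of Theorem 4.2 translate directly into code. The sketch below also checks unbiasedness by enumerating every ordered PPSWR sample of size 2 from a small population; the population values and selection probabilities are hypothetical figures chosen only for illustration, and the function name `hansen_hurwitz` is likewise an assumption.

```python
from fractions import Fraction
from itertools import product


def hansen_hurwitz(y, p):
    """Return (estimate of total, estimate of its variance) from a PPSWR sample.

    y : sampled values, p : their initial selection probabilities.
    """
    n = len(y)
    z = [yi / pi for yi, pi in zip(y, p)]
    total = sum(z) / n
    var_est = sum((zi - total) ** 2 for zi in z) / (n * (n - 1))
    return total, var_est


# Hypothetical population and selection probabilities (sum to 1).
Y = [2, 4, 6, 12]
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
n = 2

E_total = E_var = Fraction(0)
for draw in product(range(len(Y)), repeat=n):      # all ordered samples
    prob = Fraction(1)
    for i in draw:
        prob *= P[i]
    t, v = hansen_hurwitz([Y[i] for i in draw], [P[i] for i in draw])
    E_total += prob * t
    E_var += prob * v

true_var = (sum(Fraction(y * y) / p for y, p in zip(Y, P)) - sum(Y) ** 2) / n
print(E_total, E_var, true_var)
```

Exact fractions are used so that the expected values can be compared with the population total and the true variance without rounding error.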


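Corollary 1:

$$V(\hat{\bar{Y}}_{PPSWR}) = n^{-1}\sum_{i=1}^{N} P_i\left(\frac{Y_i}{N P_i} - \bar{Y}\right)^2,$$

$$\text{Est. } V(\hat{\bar{Y}}_{PPSWR}) = \sum_{i=1}^{n}\left(\frac{y_i}{N p_i} - \hat{\bar{Y}}_{PPSWR}\right)^2\Big/ n(n - 1).$$

Corollary 2: For $p_i$ or $P_i = 1/N$,

$$V(\hat{\bar{Y}}_{PPSWR}) = \sigma_y^2/n = V(\bar{y})_{SRSWR},$$

$$\text{Est. } V(\hat{\bar{Y}}_{PPSWR}) = s_y^2/n = \text{Est. } V(\bar{y})_{SRSWR}.$$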
An alternative form of PPSWR estimator: Define

$$\hat{Y}_{PPSWR} = \sum_{i=1}^{N} Y_i\,\big(f_i/E(f_i)\big),$$

where $f_i$ = number of times $U_i$ is included in the sample, $\sum_{i=1}^{N} f_i = n$. The joint
distribution of $f_1, f_2, \ldots, f_N$ is multinomial with $E(f_i) = nP_i$, $V(f_i) = nP_i(1 - P_i)$
and $\text{cov}(f_i, f_j) = -nP_iP_j$ ($i \ne j$). Hence,

$$V(\hat{Y}_{PPSWR}) = \sum_{i=1}^{N} Y_i^2\,\frac{V(f_i)}{[E(f_i)]^2} + \sum\sum_{i \ne j} Y_i Y_j\,\frac{\text{cov}(f_i, f_j)}{E(f_i)E(f_j)}$$

$$= \sum_{i=1}^{N} Y_i^2\,\frac{nP_i(1 - P_i)}{n^2 P_i^2} - \sum\sum_{i \ne j} Y_i Y_j\,\frac{nP_iP_j}{n^2 P_i P_j} = n^{-1}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right).$$

4.2.3. Comparison with SRSWR

With x as the size measure, $V(\hat{Y})_{PPSWR} = n^{-1}\left(X\sum_{i=1}^{N} Y_i^2/X_i - Y^2\right)$. Comparing
with $V(\hat{Y})_{SRSWR}$, it may be seen that PPSWR will be more efficient than
SRSWR if $\sum_{i=1}^{N} (X_i - \bar{X})(Y_i^2/X_i) > 0$, that is, if x and $y^2/x$ are positively
correlated (Raj, 1954). However, such a criterion is difficult to verify
beforehand when deciding whether to adopt PPSWR sampling.

If near proportionality does not exist between y and x, there may arise
situations when PPSWR would be worse than SRSWR. For instance, when y = a
+ bx, PPSWR sampling will be less efficient than SRSWR if

$$(1/\sigma_x^2)\,(\bar{X}/\bar{X}_H - 1) > b^2/a^2,$$

where

$$\bar{X}_H = N\Big/\sum_{i=1}^{N} (1/X_i) \quad \text{and} \quad \sigma_x^2 = \sum_{i=1}^{N} (X_i - \bar{X})^2/N.$$

The above condition is likely to be satisfied if a is sufficiently large, that 
is, if the regression line is very far away from the origin. Under such 
circumstances Reddy and Rao (1977) have suggested a PPSWR estimator with 
a transformation on the auxiliary variate x, so that the modified PPSWR is 
found to be highly efficient in most of the situations met in practice. In this 
connection the following theorem due to Reddy and Rao (1977) is stated 
omitting the proof. 

Theorem 4.3: Let $\hat{Y} = N\bar{y}$ be the usual unbiased estimator of Y in SRSWR and
$\hat{Y}_{PPSWR} = \dfrac{X}{n}\sum_{i=1}^{n} \dfrac{y_i}{x_i}$ be the usual PPSWR unbiased estimator. Define

$$X'_i = X_i + d\bar{X}, \quad \text{for } i = 1, 2, \ldots, N,$$

where d is any non-negative scalar. Obviously $X' = X(1 + d)$. The modified
unbiased PPSWR estimator is

$$\hat{Y}'_{PPSWR} = \frac{X'}{n}\sum_{i=1}^{n} \frac{y_i}{x_i + d\bar{X}}.$$

Then, $V(\hat{Y}')_{PPSWR} \le [d/(1 + d)]\,V(\hat{Y}) + [1/(1 + d)]\,V(\hat{Y})_{PPSWR}$.

The above theorem shows that $\hat{Y}'_{PPSWR}$ is better than the worse of $\hat{Y}$ and
$\hat{Y}_{PPSWR}$. The non-negative scalar d is chosen such that $V(\hat{Y}'_{PPSWR})$ is
minimum. This is achieved by choosing d such that the regression line of y on
the transformed variate x' passes through a point close to the origin. Following
Reddy (1974) the right choice of d is $d = d^* = (1 - k)/k$, where $k = \beta/R$; $\beta$ is the
regression coefficient of y on x and R is the ratio of the population total of y to the
population total of x. Also, it is necessary that $X_i + d\bar{X} > 0$ for all i. Then, for $d =
d^* = (1 - k)/k$, denoting the estimator as $\hat{Y}^*_{PPSWR}$,

$$V(\hat{Y}^*_{PPSWR}) = \frac{1}{n}\left(\sum_{i=1}^{N} \frac{Y_i^2\,X}{kX_i + (1 - k)\bar{X}} - Y^2\right).$$

The knowledge about k is generally not available. However, from previous
surveys on the same population some good guessed or predicted values for $\beta$
and R and hence for k may be obtained to be used in the modified PPSWR
estimator.

When y and x are negatively correlated, the transformed variate x' is such
that

$$X'_i = d\bar{X} - X_i, \quad i = 1, 2, \ldots, N,$$

where d is such that $d\bar{X} > \max X_i$. For the choice of $d^* = (k - 1)/k$, the regression
line of y on $x^*$ (where $X^*_i = d^*\bar{X} - X_i$) passes through the origin.

Assume that the finite population values $(Y_1, Y_2, \ldots, Y_N)$ are a random
sample from an infinite population following the model $Y_i = \beta X_i + e_i$, $E_m(e_i \mid X_i)
= 0$, $V_m(e_i \mid X_i) = aX_i^g$, $g \ge 0$ and $E_m(e_i e_j \mid X_i, X_j) = 0$ ($i \ne j$), i, j = 1, 2, ..., N.

Thus,

$$E_m V(\hat{Y}_{PPSWR}) = (a/n)\left(X\sum X_i^{g-1} - \sum X_i^g\right) \quad \text{and}$$

$$E_m V(\hat{Y}_{SRSWR}) = ((N - 1)/n)\left(a\sum X_i^g + \beta^2 N S_x^2\right).$$

Hence, the PPSWR estimator will be superior to the SRSWR estimator if

$$\rho(x, x^{g-1}) > -\,(N - 1)\,\beta^2 S_x \big/ \big(N a\,S_{x^{g-1}}\big),$$

where $\rho$ is the correlation between x and $x^{g-1}$; $S_x$ and $S_{x^{g-1}}$ are the standard
deviations of x and $x^{g-1}$ respectively. Hence, for g > 1 the PPSWR estimator is
always more efficient.


4.2.4 Estimation of Gain Due to PPSWR Sampling from a 
Given PPSWR Sample 

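Now, $V(\hat{Y}_{SRSWR}) = N^2\sigma^2/n = n^{-1}\left(N\sum Y_i^2 - Y^2\right)$. Est. $V(\hat{Y}_{SRSWR})$ from a PPS
sample can be derived by finding out estimates of $\sum_{i=1}^{N} Y_i^2$ and $Y^2$. Thus, we
have

$$\text{Est.}\left(\sum_{i=1}^{N} Y_i^2\right) = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i^2}{p_i}$$

and

$$\text{Est.}(Y^2) = \hat{Y}^2_{PPSWR} - \text{Est. } V(\hat{Y}_{PPSWR}).$$

Thus,

$$\text{Est. } V(\hat{Y}_{SRSWR}) = N n^{-2}\sum_{i=1}^{n} \frac{y_i^2}{p_i} - n^{-1}\hat{Y}^2_{PPSWR} + n^{-1}\,\text{Est. } V(\hat{Y}_{PPSWR}).$$

Therefore, the gain due to pps sampling estimated from the sample is

$$[\text{Est. } V(\hat{Y}_{SRSWR}) - \text{Est. } V(\hat{Y}_{PPSWR})]\,/\,\text{Est. } V(\hat{Y}_{PPSWR}).$$

Note:

$$\text{Est. } V(\hat{Y}_{SRSWR}) - \text{Est. } V(\hat{Y}_{PPSWR}) = \frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{y_i^2}{p_i}\right)\left(N - \frac{1}{p_i}\right).$$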

4.2.5 PPSWR Sampling with Two Auxiliary Variables 

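Suppose we have information on two auxiliary variables $x_1$ and $x_2$ leading to
selection probabilities $P_i$ and $P'_i$ with $\sum_{i=1}^{N} P_i = \sum_{i=1}^{N} P'_i = 1$.

Let $P''_i = \alpha P_i + (1 - \alpha)P'_i$, $0 \le \alpha \le 1$.

Now, denoting by $\hat{Y}_{PPSWR}$, $\hat{Y}'_{PPSWR}$ and $\hat{Y}''_{PPSWR}$ the estimators based on selection
with $P_i$, $P'_i$ and $P''_i$ respectively, Theorem 4.1 gives

$$V(\hat{Y}_{PPSWR}) = n^{-1}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y^2\right),$$

$$V(\hat{Y}'_{PPSWR}) = n^{-1}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P'_i} - Y^2\right),$$

$$V(\hat{Y}''_{PPSWR}) = n^{-1}\left(\sum_{i=1}^{N} \frac{Y_i^2}{P''_i} - Y^2\right).$$

T. J. Rao (1992) has shown that

$$V(\hat{Y}''_{PPSWR}) \le \alpha\,V(\hat{Y}_{PPSWR}) + (1 - \alpha)\,V(\hat{Y}'_{PPSWR}).$$

It may be observed that if $V(\hat{Y}) < V(\hat{Y}')$, then $V(\hat{Y}'') < V(\hat{Y}')$ and if
$V(\hat{Y}') < V(\hat{Y})$, then $V(\hat{Y}'') < V(\hat{Y})$. This shows that selection with $P''_i$ is better
than the worse of selection with $P_i$ and $P'_i$. Tripathi and Chaubey (1990) have
considered use of multivariate information as size for obtaining optimum
probabilities of selection in PPSWR sampling.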

4.2.6 Alternative Estimators in PPSWR Sampling for 
Multiple Characteristics 

In large scale multipurpose sample surveys one often comes across many study 
variables and one needs to construct estimators for a characteristic that is not 
strongly related to the probabilities used in selecting the sample. That is, the 
chosen size measure may not be related to some of the study variables and in 
some cases may be poorly correlated. For example, as pointed out by J.N.K Rao 
(1966a, b) in a sample survey for obtaining farm facts, the poultry population in 
a farm may not be related to size of the farm. In such cases the usual unbiased 
PPSWR estimators for the characteristics of study variates poorly correlated 
with size variate may have large variances. Under such circumstances J.N.K. 
Rao (1966a) has provided alternative estimators, which are more efficient, 
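although biased. If y and x are unrelated and the sample is drawn with pps to x
and with replacement, the alternative estimator of the population total Y is given by

$$\hat{Y}^*_{PPSWR} = \frac{N}{n}\sum_{i=1}^{n} y_i \quad \text{in place of} \quad \hat{Y}_{PPSWR} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{p_i}.$$

Now, $E(\hat{Y}^*_{PPSWR}) = N\sum_{i=1}^{N} Y_i P_i$

and Bias $(\hat{Y}^*_{PPSWR}) = N\sum_{i=1}^{N} Y_i P_i - Y$.

This shows that if y and x are uncorrelated the bias would be zero and if
the correlation is small, the bias is bound to be small, although $\hat{Y}^*_{PPSWR}$ is not
consistent.

Consider the super population model $Y_i = \mu + e_i$ (i = 1, 2, ..., N) with
$E_m(e_i \mid X_i) = 0$, $E_m(e_i^2 \mid X_i) = a$ and $E_m(e_i e_j \mid X_i, X_j) = 0$, $i \ne j$.

Now,

$$E_m V(\hat{Y}_{PPSWR}) = \frac{1}{n}\left[(\mu^2 + a)\sum_{i=1}^{N} P_i^{-1} - N^2\mu^2 - Na\right],$$

$$E_m V(\hat{Y}^*_{PPSWR}) = \frac{N^2 a}{n}\left(1 - \sum_{i=1}^{N} P_i^2\right).$$

Hence,

$$E_m V(\hat{Y}_{PPSWR}) - E_m V(\hat{Y}^*_{PPSWR}) = \frac{1}{n}\left[(\mu^2 + a)\left(\sum P_i^{-1} - N^2\right) + aN^2\left(\sum P_i^2 - \frac{1}{N}\right)\right],$$

which is non-negative (and positive unless all $P_i = 1/N$), since $\sum P_i^{-1} \ge N^2$ and $\sum P_i^2 \ge 1/N$.

This shows the superiority of the alternative estimator over the usual
PPSWR estimator under the super population context.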

An unbiased estimator of $V(\hat{Y}^*_{PPSWR})$ is given by

$$\text{Est. } V(\hat{Y}^*_{PPSWR}) = N^2\sum_{i=1}^{n} \frac{(y_i - \bar{y})^2}{n(n - 1)},$$

which has the same form as that of $\hat{Y}$ in SRSWR. Bansal and
Singh (1985) considered an alternative estimator of the population total for
characteristics that are poorly correlated with selection probabilities, given by

$$\hat{Y}_{BS} = (1/n)\sum_{i=1}^{n} \frac{y_i}{p^*_i},$$

where $p^*_i = (1 + 1/N)^{1 - \rho}\,(1 + p_i)^{\rho} - 1$, $\rho$ being corr (y, x). Amahia, Chaubey and
Rao (1989) proposed a simpler alternative estimator as

$$\hat{Y} = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{p^*_i},$$

where $p^*_i = (1 - \rho)(1/N) + \rho P_i$, i = 1, 2, ..., N.

Suppose in a multipurpose survey information on two size variables $x_1$
and $x_2$ is available to estimate the population totals $Y_1$ and $Y_2$ of the characters
$y_1$ and $y_2$ respectively. Suppose further that a sample of n units is selected
with ppswr to $x_1$, which is strongly correlated with $y_1$ but not with $y_2$. Let $p_{1i} =
X_{1i}/X_1$ and $p_{2i} = X_{2i}/X_2$.

Consider the following estimates for the population total $Y_2$:

$$T_0 = \frac{N}{n}\sum_{i=1}^{n} y_{2i},$$

$$T_1 = \frac{1}{n}\sum_{i=1}^{n} \frac{y_{2i}}{p_{1i}}$$

and 



The estimators $T_0$ (biased) and $T_1$ (unbiased) in PPSWR have been
discussed by Rao (1966a, b). Like $T_0$, the estimator $T_2$ is also biased (constant
bias), but the bias can be removed by Hartley-Rossization (Lanke, 1975) and
thus we have


h 


= _1 V flL 4 1 
n {f\ Pn n -1 





\ 

P2i 

P\i) 


It may be seen that if $y_{2i} = x_{2i}$, $T_3$ coincides with $Y_2$. Thus, $T_3$ is expected to have small variance if $y_{2i}$ is nearly proportional to $x_{2i}$ and is therefore an appropriate estimator of $Y_2$.





4.2.7 A General Result for the Derivation of Mean Square 
Errors and Their Non-negative Unbiased Estimators 

The usual method to derive the mean square errors of linear estimators and their 
estimators often results in tedious algebra and tricky manipulations. J.N.K. Rao 
(1979) has established an ingenious theorem which is stated below without 
proof for reference. 

Theorem 4.5 (J.N.K. Rao, 1979): Based on sample s with associated probability p(s) define a general estimator of population total Y as

$$\hat{Y} = \sum_{i=1}^{N} d_{is} Y_i,$$

where $d_{is} = 0$ if the unit i is not included in the sample s, and the weights $d_{is}$ do not depend on the $Y_i$.

Suppose $\hat{Y}$ is such that MSE$(\hat{Y})$ is zero for $Y_i = CW_i$ $(i = 1, 2, \ldots, N)$ where $W_i$ are non-zero constants and C is also a non-zero arbitrary constant. Then, (a) MSE$(\hat{Y})$ can be written in the form

$$\text{MSE}(\hat{Y}) = -\sum\sum_{1 \le i < j \le N} d_{ij} W_i W_j (Z_i - Z_j)^2,$$

where $Z_i = Y_i/W_i$ and

$$d_{ij} = E(d_{is} - 1)(d_{js} - 1) = \sum_{s \ni i,j} p(s)\,d_{is}d_{js} - \sum_{s \ni i} p(s)\,d_{is} - \sum_{s \ni j} p(s)\,d_{js} + 1, \quad i < j.$$

(b) The non-negative quadratic unbiased estimator of MSE$(\hat{Y})$ is necessarily of the form

$$\text{Est.MSE}(\hat{Y}) = -\sum\sum_{i < j \in s} d_{ij}(s)\,W_i W_j (Z_i - Z_j)^2,$$

where $d_{ij}(s)$ does not depend on the $y_i$, $d_{ij}(s) = 0$ if s does not contain both i and j, and $E(d_{ij}(s)) = \sum_{s \ni i,j} p(s)\,d_{ij}(s) = d_{ij}$ $(i < j)$, the unbiasedness condition.

Note 1. Let $Q = \sum_{i=1}^{m}\sum_{j=1}^{m} a_{ij} x_i x_j$ be a non-negative definite quadratic form in $x = (x_1, x_2, \ldots, x_m)$ with $\sum\sum a_{ij} = 0$. That is, Q = 0 when the x's are equal. Then it follows that $\sum_{j=1}^{m} a_{ij} = 0$ for all $i = 1, 2, \ldots, m$ and one may write

$$Q = -\sum\sum_{i<j} a_{ij}(x_i - x_j)^2.$$

2. The evaluation of MSE$(\hat{Y})$ depends only on $d_{ij}$, which in turn depends on the sampling design p(s).

a) If $\hat{Y}$ is unbiased, $\sum_{s \ni i} p(s)\,d_{is} = 1$ for all i, and as such

$$d_{ij} = \sum_{s \ni i,j} p(s)\,d_{is}d_{js} - 1 = E(d_{is}d_{js}) - 1, \quad i < j.$$

b) If $d_{is}$ is independent of s, that is, $d_{is} = d_i$ for $s \ni i$, then $d_{ij} = d_i d_j \pi_{ij} - d_i\pi_i - d_j\pi_j + 1$. Again, if the condition of unbiasedness is imposed, $d_i = 1/\pi_i$,

$$\hat{Y} = \sum_{i \in s}\frac{y_i}{\pi_i} \quad \text{and} \quad d_{ij} = \frac{\pi_{ij}}{\pi_i\pi_j} - 1.$$

With $W_i = \pi_i$,

$$\text{MSE}(\hat{Y}) = \sum\sum_{i<j}(\pi_i\pi_j - \pi_{ij})\left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2.$$

c) If $d_{ij} = d$, $i < j$, MSE$(\hat{Y}) = -d\sum\sum_{i<j} W_i W_j (Z_i - Z_j)^2$.

3. There may be many choices of $d_{ij}(s)$ in the non-negative quadratic estimator of MSE$(\hat{Y})$. A straightforward choice for $d_{ij}(s)$ is $d_{ij}/\pi_{ij}$, or a general one such as $d_{ij}(s) = d_{ij}\,a_{ij}(s)/E(a_{ij}(s))$ may be taken into consideration.

4. Some unbiased quadratic estimators of MSE$(\hat{Y})$, where $\hat{Y}$ is unbiased, may also be derived by expressing MSE$(\hat{Y})$ as

$$\text{MSE}(\hat{Y}) = -\sum\sum_{i<j} d_{ij} W_i W_j (Z_i - Z_j)^2 = -\sum\sum_{i<j} W_i W_j (Z_i - Z_j)^2\,\big(E(d_{is}d_{js}) - 1\big)$$

$$= \sum\sum_{i<j} g_{ij}(1 - h_{ij}) = \sum\sum_{i<j} g_{ij} - \sum\sum_{i<j} g_{ij}h_{ij},$$

where $g_{ij} = W_i W_j (Z_i - Z_j)^2$ and $h_{ij} = E(d_{is}d_{js})$.

Thus, we write

$$\text{Est.MSE}(\hat{Y}) = \sum\sum_{i<j \in s} g_{ij}\,a_{ij} - \sum\sum_{i<j \in s} g_{ij}\,h_{ij}.$$

(Vijayan, Mukhopadhay and Bhattacharya, 1995)

The choices for $a_{ij}$ and $h_{ij}$ are

i) $a_{ij}^{(1)} = d_{is}d_{js}/E(d_{is}d_{js})$, $a_{ij}^{(2)} = 1/\{M_2\,p(s)\}$, where $M_2 = \binom{N-2}{n-2}$, $a_{ij}^{(3)} = 1/\pi_{ij}$, $a_{ij}^{(4)} = p(s \mid i, j)/p(s)$, etc.

ii) $h_{ij}^{(1)} = d_{is}d_{js}$, $h_{ij}^{(2)} = h_{ij}/\{M_2\,p(s)\}$, $h_{ij}^{(3)} = h_{ij}/\pi_{ij}$, $h_{ij}^{(4)} = h_{ij}\,p(s \mid i, j)/p(s)$, etc.

5. Alternatively, if $\hat{Y}$ is unbiased, $d_{ij} = E(d_{is}d_{js}) - 1$, which may be expressed in the form

$$\sum_{s \ni i,j} p(s)\,d_{ij}(s) = \sum_{s \ni i,j} p(s)\,d_{is}d_{js} - \sum_{s \ni i,j} p(s)\,\frac{f_{ij}(s)}{p(s)},$$

where $f_{ij}(s)$ is chosen such that $\sum_{s \ni i,j} f_{ij}(s) = 1$, $i < j$.

Equating term by term gives

$$d_{ij}(s) = d_{is}d_{js} - \frac{f_{ij}(s)}{p(s)}.$$

For a special case, when $d_{is} = d_i$ for $i \in s$ and $f_{ij}(s) = p(s)/\pi_{ij}$, $d_{ij}(s)$ reduces to

$$d_{ij}(s) = \frac{\pi_{ij} - \pi_i\pi_j}{\pi_{ij}\,\pi_i\,\pi_j}, \quad i < j \in s.$$

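6. Expressing $V(\hat{Y})$ in the form $V(\hat{Y}) = \sum_{i} a_{ii} Y_i^2 + \sum\sum_{i \ne j} a_{ij} Y_i Y_j$, some possible estimates, which may not satisfy non-negativity requirements, are

$$V_1 = \sum_{i \in s}\frac{a_{ii}\,y_i^2}{\pi_i} + \sum\sum_{i \ne j \in s}\frac{a_{ij}\,y_i y_j}{\pi_{ij}},$$

$$V_2 = \frac{\sum_{i \in s} a_{ii}\,y_i^2}{M_1\,p(s)} + \frac{\sum\sum_{i \ne j \in s} a_{ij}\,y_i y_j}{M_2\,p(s)},$$

where $M_i = \binom{N-i}{n-i}$, $i = 1, 2$.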


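Application to PPSWR

$\hat{Y} = (1/n)\sum_{i=1}^{N} f_i\,Y_i/p_i$, where $f_i$ is the number of times the ith unit occurs in the sample. Choosing

$$d_{ij}(s) \propto d_{is}d_{js},$$

we have $d_{ij}(s) = d_{ij}\,[(d_{is}d_{js})/E(d_{is}d_{js})]$. Thus, for PPSWR, since $d_{is} = f_i/(np_i)$ and $E(f_i f_j) = n(n-1)p_i p_j$,

$$d_{ij} = \frac{n(n-1)\,p_i p_j}{n^2 p_i p_j} - 1 = -\frac{1}{n}, \qquad d_{ij}(s) = -\frac{1}{n}\cdot\frac{f_i f_j}{n^2 p_i p_j}\Big/\frac{n(n-1)\,p_i p_j}{n^2 p_i p_j} = -\frac{f_i f_j}{n^2(n-1)\,p_i p_j}.$$

Therefore, with $W_i = p_i$,

$$\text{Est.}V(\hat{Y}) = -\sum\sum_{i<j} d_{ij}(s)\,W_i W_j (z_i - z_j)^2 = \frac{1}{n^2(n-1)}\sum\sum_{i<j} f_i f_j\left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2 = \frac{1}{n(n-1)}\sum_{i=1}^{n}\left(\frac{y_i}{p_i} - \hat{Y}\right)^2,$$

the usual estimator of variance in PPSWR sampling. Further, if we choose $d_{ij}(s) = d_{ij}/\pi_{ij}$, $i < j \in s$, another non-negative variance estimator is obtained as

$$\text{Est.}V(\hat{Y}) = \frac{1}{n}\sum\sum_{i<j \in s}\frac{p_i p_j}{\pi_{ij}}\left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2,$$

where $\pi_{ij} = 1 - (1 - p_i)^n - (1 - p_j)^n + (1 - p_i - p_j)^n$.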
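
The equality between the pairwise form and the usual PPSWR variance estimator can be confirmed numerically. This is a minimal sketch, assuming the sample is supplied draw by draw (so a unit drawn twice appears twice); the function names are illustrative and not from the text.

```python
from fractions import Fraction as F
from itertools import combinations

def usual_var_est(y, p):
    """(1 / (n (n - 1))) * sum over draws of (y_k / p_k - Yhat)^2."""
    n = len(y)
    z = [yk / pk for yk, pk in zip(y, p)]
    yhat = sum(z) / n
    return sum((zk - yhat) ** 2 for zk in z) / (n * (n - 1))

def pairwise_var_est(y, p):
    """(1 / (n^2 (n - 1))) * sum over pairs of draws of (z_k - z_l)^2.

    Summing over pairs of draws is the same as summing f_i f_j (z_i - z_j)^2
    over pairs of distinct units, because repeated draws of one unit add zero.
    """
    n = len(y)
    z = [yk / pk for yk, pk in zip(y, p)]
    return sum((a - b) ** 2 for a, b in combinations(z, 2)) / (n * n * (n - 1))

# Hypothetical sample of n = 4 draws in which the first unit was drawn twice.
y = [F(3), F(3), F(8), F(5)]
p = [F(1, 10), F(1, 10), F(2, 5), F(1, 5)]
print(usual_var_est(y, p) == pairwise_var_est(y, p))
```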

4.2.8 Estimator Based on Distinct Units in PPSWR Sampling 

Consider the problem of estimation of population total Y based on data $d^*$ obtained from an ordered sample of size n = 3, drawn with unequal probability with replacement, the probability of selecting the ith unit being $p_i$. Let

$$d^* = \{(i_1, y_{i_1}), (i_1, y_{i_1}), (i_2, y_{i_2})\},$$

where $i_1$, $i_1$ and $i_2$ are the labels of the units drawn at the 1st, 2nd and 3rd draws and $y_{i_1}$ and $y_{i_2}$ are the values of the characteristics attached to these units. In the present case there are two distinct units, that is, $\nu = 2$ and hence

$$d = \{(i_1, y_{i_1}), (i_2, y_{i_2})\}.$$

We already know that $\hat{Y}_{PPSWR} = (1/n)\sum y_i/p_i$ based on $d^*$ is an unbiased estimator of Y. Applying the Rao-Blackwell theorem, an estimator based on the sufficient statistic d, that is, $t = E(\hat{Y}_{PPSWR} \mid d)$, is uniformly better than $\hat{Y}_{PPSWR}$. Explicitly,

$$t = \frac{\frac{1}{3}\left(\frac{2y_{i_1}}{p_{i_1}} + \frac{y_{i_2}}{p_{i_2}}\right)p_{i_1}^2 p_{i_2} + \frac{1}{3}\left(\frac{y_{i_1}}{p_{i_1}} + \frac{2y_{i_2}}{p_{i_2}}\right)p_{i_1} p_{i_2}^2}{p_{i_1} p_{i_2}\,(p_{i_1} + p_{i_2})} = \frac{1}{3}\left[\frac{y_{i_1}}{p_{i_1}} + \frac{y_{i_2}}{p_{i_2}} + \frac{y_{i_1} + y_{i_2}}{p_{i_1} + p_{i_2}}\right].$$

Pathak (1962b) has given the form of the estimator for any n. However, the form is too cumbersome to be computed in practice.
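
For small populations the conditional expectation can be obtained by brute force. The sketch below (function name and population values are hypothetical) enumerates all ordered with-replacement samples, groups them by their set of distinct labels, and averages the PPSWR estimate within each group. Enumeration grows as $N^n$, so it only illustrates the idea and does not replace Pathak's general formula.

```python
from fractions import Fraction as F
from itertools import product
from collections import defaultdict

def rao_blackwell_table(Y, p, n=3):
    """Map each set of distinct labels to (E[Yhat_PPSWR | set], P(set))."""
    num = defaultdict(F)   # accumulates prob * estimate
    den = defaultdict(F)   # accumulates prob
    for draws in product(range(len(Y)), repeat=n):
        prob = F(1)
        for i in draws:
            prob *= p[i]
        est = sum(Y[i] / p[i] for i in draws) / n
        key = frozenset(draws)
        num[key] += prob * est
        den[key] += prob
    return {k: (num[k] / den[k], den[k]) for k in num}

# Hypothetical population of N = 3 units.
Y = [2, 5, 9]
p = [F(1, 5), F(3, 10), F(1, 2)]
table = rao_blackwell_table(Y, p)
t01, prob01 = table[frozenset({0, 1})]
print(t01, prob01)
```

For a set with two distinct units the tabulated value agrees with the closed form $\frac{1}{3}[y_{i_1}/p_{i_1} + y_{i_2}/p_{i_2} + (y_{i_1} + y_{i_2})/(p_{i_1} + p_{i_2})]$.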

Rao-Blackwellization implies that one will have to search for the best estimator among the class of estimators, consisting of an infinite number of estimators based on the sufficient statistic d. As the sufficient statistic d is not complete (Cassel, Sarndal and Wretman, 1977), it may not be possible to find a best estimator.



4.3 Sampling with Unequal Probabilities without Replacement 

4.3.1 Introduction 

As we have seen earlier, in sampling with unequal probabilities with replacement (ppswr sampling) the selected unit is not removed from the population after it is drawn. However, one may also sample units with unequal probabilities without replacing the selected units. Such a method of sampling is called unequal probability sampling without replacement (UPSWOR) or probability proportional to size without replacement (PPSWOR) sampling. It is also observed that the estimate of population mean or total in equal probability sampling without replacement (SRSWOR) is always more efficient than that in equal probability sampling with replacement (SRSWR). As regards unequal probability schemes, the estimate based on a sample drawn without replacement may not always be more precise than one based on a sample drawn with replacement. Moreover, PPSWOR strategies are usually very complex in nature and are not preferred in large scale sample surveys with small sampling fraction, since in such cases the difference in efficiencies between PPSWR and PPSWOR sampling is not appreciable. In spite of these comments, a large volume of literature on PPSWOR sampling has been produced since 1950.

Suppose n units are to be selected from a population consisting of N units with pps to x (size measure) at each draw, without replacing the units selected in the previous draws. Let $p_i = X_i/X$ be the probability of selecting the ith unit $U_i$ of the population at the first draw $(i = 1, 2, \ldots, N)$, so that $\sum p_i = 1$. Then, the conditional probability of selecting any other unit, say $U_j$, at the second draw, given that the ith unit $U_i$ has been drawn at the first draw, is $p_{j|i} = p_j/(1 - p_i)$.

Further, let $P_{i(r)}$ be the probability of selecting the ith unit at the rth draw. Then $P_{i(1)} = p_i$ and

$P_{i(2)} = \sum_{j \ne i}$ (Prob. of drawing the jth unit at the 1st draw) $\times$ (Prob. of drawing the ith unit at the 2nd draw given that the jth unit has been drawn in the 1st draw)

$$= \sum_{j \ne i}^{N}\frac{p_j\,p_i}{1 - p_j} = p_i\sum_{j \ne i}^{N}\frac{p_j}{1 - p_j} = p_i\left[\sum_{j=1}^{N}\frac{p_j}{1 - p_j} - \frac{p_i}{1 - p_i}\right].$$

It may be seen that $P_{i(1)} \ne P_{i(2)}$ for every i, unless $p_i = 1/N$, and further $P_{i(2)}$ is not proportional to $P_{i(1)}$ or $p_i$.

For n = 2, the probability that the ith unit $U_i$ is included in the sample, denoted by $\pi_i$, is

$\pi_i$ = Prob (either the ith unit is drawn at the 1st draw or the ith unit is drawn at the second draw)

$$= P_{i(1)} + P_{i(2)} = p_i + p_i\sum_{j \ne i}^{N}\frac{p_j}{1 - p_j} = p_i\left(1 + \sum_{j \ne i}^{N}\frac{p_j}{1 - p_j}\right) = p_i\left(1 + \sum_{j=1}^{N}\frac{p_j}{1 - p_j} - \frac{p_i}{1 - p_i}\right).$$

Further, the probability that both $U_i$ and $U_j$ are in the sample of size n = 2 is given by

$$\pi_{ij} = p_i\,\frac{p_j}{1 - p_i} + p_j\,\frac{p_i}{1 - p_j} = p_i p_j\left(\frac{1}{1 - p_i} + \frac{1}{1 - p_j}\right).$$

For a sample size n greater than 2, and also for many sampling schemes, the formulae to calculate $\pi_i$, $\pi_{ij}$, etc. become messy and complex. However, as a general rule the probability that any unit $U_i$ is selected in the sample can be obtained as the sum of the probabilities of all ordered samples of size n containing $U_i$.
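
The n = 2 formulas for $\pi_i$ and $\pi_{ij}$ are easy to evaluate directly. The sketch below uses a hypothetical set of first-draw probabilities; the function name is illustrative. For a fixed sample size of 2 the results should satisfy $\sum_i \pi_i = 2$ and $\sum_{j \ne i}\pi_{ij} = \pi_i$.

```python
from fractions import Fraction as F

def inclusion_probs_n2(p):
    """First- and second-order inclusion probabilities for draw-by-draw PPSWOR, n = 2."""
    N = len(p)
    S = sum(pj / (1 - pj) for pj in p)
    pi = [p[i] * (1 + S - p[i] / (1 - p[i])) for i in range(N)]
    pij = {(i, j): p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))
           for i in range(N) for j in range(i + 1, N)}
    return pi, pij

# Hypothetical first-draw probabilities p_i = X_i / X.
p = [F(1, 10), F(1, 5), F(3, 10), F(2, 5)]
pi, pij = inclusion_probs_n2(p)
print(pi)
print(sum(pi))
```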

In the sampling theory literature different sampling schemes have been proposed by taking into account either the order of the draws or inclusion probabilities or other methods.

The difficulties of using the $P_{i(r)}$'s or forming estimators based on the order of draws stem from the fact that the probability of selecting a particular unit changes from draw to draw, resulting in a change of the expected value of the variate $y_i/P_{i(r)}$ with successive draws. An ingenious technique of overcoming this difficulty of changing expectation with each draw is to define a new variate for each draw in such a way that its expectation is equal to the parameter in question (Raj, 1956). As the estimates take into account the order of the draws, such estimates are popularly known as ordered estimates. The ordered estimators depend on the order of the draws and make use of conditional probabilities without explicitly calculating $\pi_i$ and $\pi_{ij}$, which at times become difficult for some sampling schemes.

4.3.2 Ordered Estimators

Among the ordered estimators discussed in the literature, one due to Raj (1956) needs special mention because of some of its nice properties. This procedure makes use of conditional probabilities without calculating $\pi_i$ and $\pi_{ij}$ and calculates expectations invoking a conditional argument. Let us consider without replacement sampling of two units from a population consisting of N units $(U_1, U_2, \ldots, U_N)$ indexed by the values of a size variable x denoted by $(X_1, X_2, \ldots, X_N)$ to estimate the population total Y of the character y. The first selection or draw is made with probabilities $p_i = X_i/X$, $i = 1, 2, \ldots, N$, and the second selection or draw is made with probabilities proportional to the sizes of the remaining units. Therefore, the conditional probability of selecting $U_j$ at the second draw given that $U_i$ is the first selection is given by $p_j/(1 - p_i)$.

Let $y_1$ and $y_2$ be the values associated with the first and second draws and $p_1$ and $p_2$ be their corresponding initial probabilities respectively. Define the estimators

$$t_1 = y_1/p_1$$

and

$$t_2 = y_1 + \frac{y_2}{p_2/(1 - p_1)} = y_1 + \frac{y_2(1 - p_1)}{p_2}.$$

Now

$$E(t_1) = \sum_{i=1}^{N}\frac{Y_i}{p_i}\,p_i = Y$$

and

$$E(t_2 \mid t_1) = y_1 + E\left(\frac{y_2(1 - p_1)}{p_2}\,\Big|\,t_1\right) = y_1 + Y - y_1 = Y.$$

Hence,

$$E(t_2) = E(Y) = Y.$$

Thus, both $t_1$ and $t_2$ are unbiased estimators of Y. Further,

$$\text{cov}(t_1, t_2) = E(t_1 t_2) - E(t_1)E(t_2) = E[E(t_1 t_2 \mid t_1)] - Y^2 = E[t_1 E(t_2 \mid t_1)] - Y^2 = E(t_1 Y) - Y^2 = YE(t_1) - Y^2 = Y^2 - Y^2 = 0,$$

so that $t_1$ and $t_2$ are uncorrelated. Now an unbiased estimator of the population total Y is given by

$$\hat{Y}_D = \bar{t} = \tfrac{1}{2}(t_1 + t_2) = \tfrac{1}{2}\left[\frac{y_1}{p_1}(1 + p_1) + \frac{y_2}{p_2}(1 - p_1)\right]$$

and

$$V(\hat{Y}_D) = \left(1 - \frac{1}{2}\sum_{i=1}^{N} p_i^2\right)\frac{1}{2}\sum_{i=1}^{N} p_i\left(\frac{Y_i}{p_i} - Y\right)^2 - \frac{1}{4}\sum_{i=1}^{N} p_i^2\left(\frac{Y_i}{p_i} - Y\right)^2.$$

As $V(\hat{Y}_{PPSWR}) = \frac{1}{2}\sum_{i=1}^{N} p_i\left(\frac{Y_i}{p_i} - Y\right)^2$ for n = 2, it may be easily verified that $V(\hat{Y}_D)$ is less than $V(\hat{Y}_{PPSWR})$. An unbiased variance estimator of $\hat{Y}_D$ is given by

$$\text{Est.}V(\hat{Y}_D) = \tfrac{1}{4}(t_1 - t_2)^2 = \tfrac{1}{4}(1 - p_1)^2\left(\frac{y_1}{p_1} - \frac{y_2}{p_2}\right)^2.$$

For pps sampling and with n = 2,

$$V(\hat{Y}_D) = \frac{1}{4}\sum\sum_{i<j} Z_i Z_j (2 - Z_i - Z_j)\left(\frac{Y_i}{Z_i} - \frac{Y_j}{Z_j}\right)^2,$$

where $Z_i = X_i/X$, and Est.$V(\hat{Y}_D) = \tfrac{1}{4}(t_1 - t_2)^2$.
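
The n = 2 results for Raj's estimator can be verified by exact enumeration of all ordered samples. This is a minimal sketch with a hypothetical population (the function name and the values of `Y` and `p` are illustrative): it returns the expectation of $\hat{Y}_D$, its variance, and the expectation of $\tfrac{1}{4}(t_1 - t_2)^2$.

```python
from fractions import Fraction as F
from itertools import permutations

def raj_n2(Y, p):
    """Exact mean, variance and mean variance-estimate of Raj's estimator for n = 2."""
    mean = second = var_est_mean = F(0)
    for i, j in permutations(range(len(Y)), 2):
        prob = p[i] * p[j] / (1 - p[i])          # unit i first, unit j second
        t1 = Y[i] / p[i]
        t2 = Y[i] + Y[j] * (1 - p[i]) / p[j]
        est = (t1 + t2) / 2
        mean += prob * est
        second += prob * est ** 2
        var_est_mean += prob * (t1 - t2) ** 2 / 4
    return mean, second - mean ** 2, var_est_mean

# Hypothetical population of N = 4 units.
Y = [3, 1, 4, 6]
p = [F(1, 10), F(1, 5), F(3, 10), F(2, 5)]
print(raj_n2(Y, p))
```

The returned mean equals the population total and the last two entries coincide, in line with the unbiasedness of both the estimator and its variance estimator.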

For general n, let the values of the elements of the sample, in order of drawing, be $y_1, y_2, \ldots, y_n$ with $p_1, p_2, \ldots, p_n$ as initial selection probabilities. Now define Raj's ordered estimator of population total as

$$\hat{Y}_D = \frac{1}{n}\sum_{i=1}^{n} t_i,$$

where $t_1 = y_1/p_1$ and, for $i \ge 2$,

$$t_i = y_1 + y_2 + \cdots + y_{i-1} + \frac{y_i}{p_i}(1 - p_1 - p_2 - \cdots - p_{i-1}).$$

As before, it can be seen that $E(\hat{Y}_D) = Y$.

For general n, it does not seem possible to write $V(\hat{Y}_D)$ in a concise form. However, Pathak (1967) has expressed $V(\hat{Y}_D)$ in the form

$$V(\hat{Y}_D) = n^{-2}\sum\sum_{i<j} p_i p_j\left(\frac{Y_i}{p_i} - \frac{Y_j}{p_j}\right)^2\left[1 + \sum_{r=1}^{n-1} Q_{ij}(r)\right],$$

where $Q_{ij}(r)$ is the probability that the units i and j are not selected in the first r draws. An unbiased estimator of $V(\hat{Y}_D)$ is given by

$$\text{Est.}V(\hat{Y}_D) = \sum_{i=1}^{n}(t_i - \bar{t})^2/\{n(n-1)\}.$$

It may be seen that $V(t_n) < V(t_{n-1}) < \cdots < V(t_1)$ and further $V(\hat{Y}_D)$ is smaller than the variance of the usual estimator $(1/n)\sum y_i/p_i$ in ppswr (Raj, 1968).
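
A short sketch of the general-n computation may help. The helpers below (names and data are hypothetical) compute Raj's estimate and its variance estimate from the draws in order, and `exact_moments` enumerates every ordered sample of a tiny population to confirm unbiasedness. The enumeration is only feasible for very small N and n.

```python
from fractions import Fraction as F
from itertools import permutations

def raj_estimate(y, p):
    """Raj's ordered estimate and variance estimate; y, p are listed in draw order (n >= 2)."""
    n = len(y)
    t, cum_y, cum_p = [], F(0), F(0)
    for yk, pk in zip(y, p):
        t.append(cum_y + yk * (1 - cum_p) / pk)   # t_1 = y_1 / p_1 when nothing is accumulated
        cum_y += yk
        cum_p += pk
    tbar = sum(t) / n
    v = sum((tk - tbar) ** 2 for tk in t) / (n * (n - 1))
    return tbar, v

def ordered_sample_prob(p_draws):
    """Probability of an ordered PPSWOR sample with the given initial probabilities."""
    prob, remaining = F(1), F(1)
    for pk in p_draws:
        prob *= pk / remaining
        remaining -= pk
    return prob

def exact_moments(Y, p, n):
    """Exact E(estimate), V(estimate) and E(variance estimate) by full enumeration."""
    mean = second = mean_v = F(0)
    for s in permutations(range(len(Y)), n):
        ys, ps = [Y[i] for i in s], [p[i] for i in s]
        prob = ordered_sample_prob(ps)
        est, v = raj_estimate(ys, ps)
        mean += prob * est
        second += prob * est ** 2
        mean_v += prob * v
    return mean, second - mean ** 2, mean_v

print(exact_moments([3, 1, 4, 6], [F(1, 10), F(1, 5), F(3, 10), F(2, 5)], 3))
```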

Another important ordered estimator, which is said to be the forerunner to that of Raj (1956), is due to Das (1951). A drawback of this estimator is that the unbiased variance estimator can sometimes assume negative values.


4.3.3 Unordering of Estimators 

Does there exist an unordered estimator corresponding to every ordered estimator? If so, is it more efficient than the ordered one? Murthy (1957) has shown that corresponding to any estimator based on the order of selection of units, there always exists an unordered estimator which can be obtained by weighting all possible ordered estimators by their respective probabilities of ordered samples. For instance, consider Raj's estimator for a sample size n = 2, with the values of the units in the sample $y_1$ and $y_2$ and corresponding initial probabilities of selection $p_1$ and $p_2$. Raj's ordered estimate corresponding to the ordered sample $s_1^* = (y_1, y_2)$ is given by

$$\hat{Y}_D(s_1^*) = \frac{1}{2}\left[(1 + p_1)\frac{y_1}{p_1} + (1 - p_1)\frac{y_2}{p_2}\right]$$

and corresponding to the ordered sample $s_2^* = (y_2, y_1)$ is given by

$$\hat{Y}_D(s_2^*) = \frac{1}{2}\left[(1 + p_2)\frac{y_2}{p_2} + (1 - p_2)\frac{y_1}{p_1}\right].$$

The probabilities of the ordered sets $s_1^*$ and $s_2^*$ are $p(s_1^*) = p_1 p_2/(1 - p_1)$ and $p(s_2^*) = p_2 p_1/(1 - p_2)$. Thus, the unordered estimator as proposed by Murthy (1957) is given by

$$\hat{Y}_M = \frac{\hat{Y}_D(s_1^*)\,p(s_1^*) + \hat{Y}_D(s_2^*)\,p(s_2^*)}{p(s_1^*) + p(s_2^*)} = \frac{1}{2 - p_1 - p_2}\left[(1 - p_2)\frac{y_1}{p_1} + (1 - p_1)\frac{y_2}{p_2}\right]$$

with

$$V(\hat{Y}_M) = \frac{1}{2}\sum\sum_{i \ne j}\frac{p_i p_j (1 - p_i - p_j)}{2 - p_i - p_j}\left(\frac{Y_i}{p_i} - \frac{Y_j}{p_j}\right)^2$$

and

$$\text{Est.}V(\hat{Y}_M) = \frac{(1 - p_1 - p_2)(1 - p_1)(1 - p_2)}{(2 - p_1 - p_2)^2}\left(\frac{y_1}{p_1} - \frac{y_2}{p_2}\right)^2.$$
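
The n = 2 formulas for Murthy's estimator can be checked by enumerating the unordered samples. The following is a minimal sketch under an assumed four-unit population (function name and data are illustrative only).

```python
from fractions import Fraction as F
from itertools import combinations

def murthy_n2(Y, p):
    """Exact mean, variance and mean variance-estimate of Murthy's estimator for n = 2."""
    mean = second = var_est_mean = F(0)
    for i, j in combinations(range(len(Y)), 2):
        # probability of the unordered sample {i, j} under draw-by-draw PPSWOR
        prob = p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))
        d = 2 - p[i] - p[j]
        est = ((1 - p[j]) * Y[i] / p[i] + (1 - p[i]) * Y[j] / p[j]) / d
        v = ((1 - p[i] - p[j]) * (1 - p[i]) * (1 - p[j]) / d ** 2
             * (Y[i] / p[i] - Y[j] / p[j]) ** 2)
        mean += prob * est
        second += prob * est ** 2
        var_est_mean += prob * v
    return mean, second - mean ** 2, var_est_mean

# Hypothetical population of N = 4 units.
Y = [3, 1, 4, 6]
p = [F(1, 10), F(1, 5), F(3, 10), F(2, 5)]
print(murthy_n2(Y, p))
```

Comparing the pairwise coefficients, $(1 - p_i - p_j)/(2 - p_i - p_j) \le (2 - p_i - p_j)/4$, so the variance obtained here cannot exceed the n = 2 variance of Raj's ordered estimator.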

For a general n, Murthy's unordered estimator may be put in the form

$$\hat{Y}_M = \frac{1}{p(s)}\sum_{i=1}^{n} y_i\,p(s \mid i),$$

where p(s) is the probability of getting the unordered sample s and $p(s \mid i)$ is the conditional probability of observing the sample s given that the ith unit has been selected in the first draw. Further,

$$V(\hat{Y}_M) = \frac{1}{2}\sum\sum_{i \ne j}^{N} p_i p_j\left[1 - \sum_{s \ni i,j}{}^{\prime}\,\frac{p(s \mid i)\,p(s \mid j)}{p(s)}\right]\left(\frac{Y_i}{p_i} - \frac{Y_j}{p_j}\right)^2,$$

where $\sum'_{s \ni i,j}$ denotes summation over all samples containing i and j, and its unbiased estimator is given by

$$\text{Est.}V(\hat{Y}_M) = \frac{1}{2}\sum\sum_{i \ne j}^{n} p_i p_j\left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2\big[p(s)\,p(s \mid i, j) - p(s \mid i)\,p(s \mid j)\big]\,p(s)^{-2}.$$

Pathak and Shukla (1966) have shown that for all n, Est.$V(\hat{Y}_M)$ is non-negative for ppswor sampling.

That an unordered estimator is more efficient than the estimator based on 
ordered sample can be proved by the application of Rao-Blackwell theorem 
(Basu, 1958). This assertion was proved by Murthy (1957) without appealing to 
the concepts of sufficiency and Rao-Blackwellization applied to survey 
sampling. 

Murthy’s technique can also be used to form unordered (also termed 
symmetrised) estimators from other ordered estimators proposed by Das 
(1951), Sukhatme (1953) and others. 

4.3.4 The Horvitz-Thompson Estimator of Population Total

The superiority of equal probability without replacement sampling over equal probability with replacement sampling, because of the presence of repeated observations in the latter, encouraged researchers to look for an unequal probability without replacement sampling (UPWOR) with a view to obtaining more efficient estimators than those based on Hansen and Hurwitz's (1943) PPSWR sampling. Ultimately this led to the formulation of a path finding generalized estimator by Horvitz and Thompson (1952) for the population total. The Horvitz-Thompson estimator is essentially a 'π estimator' (Sarndal, Swensson and Wretman, 1992), defined with the help of inclusion probabilities, and uses the principle of 'π expansion' to provide an unbiased estimate of the population total. In the words of SSW (1992) “the π expansion has the effect of increasing the importance of the elements in the sample. Because the sample contains fewer elements than the population, an expansion is required to reach the level of the whole population. The kth element, when present in the sample, will as it were represent $1/\pi_k$ ($\pi_k$ being the inclusion probability for the kth unit) population elements”.

For a fixed sample s of size n with an arbitrary sampling design p(s) the Horvitz-Thompson estimator of population total Y is defined as

$$\hat{Y}_{HT} = \sum_{i=1}^{N}\frac{I_i Y_i}{\pi_i}, \quad \text{with } \pi_i > 0 \text{ for all } i \in U,$$

where $I_i$ is a sample membership function defined as $I_i = 1$ if $i \in s$ and $I_i = 0$ otherwise.

Thus, $I_i$ is a Bernoulli random variable having

$$E(I_i) = \sum_{s \ni i} p(s) = \pi_i, \quad E(I_i^2) = \pi_i, \quad E(I_i I_j) = \sum_{s \ni (i,j)} p(s) = \pi_{ij},$$

$$V(I_i) = \pi_i(1 - \pi_i) \quad \text{and} \quad \text{cov}(I_i, I_j) = \pi_{ij} - \pi_i\pi_j.$$

An alternative form, which is generally used to compute the estimate, is

$$\hat{Y}_{HT} = \sum_{i \in s}\frac{y_i}{\pi_i} = \sum_{i=1}^{n}\frac{y_i}{\pi_i}, \quad \pi_i > 0 \text{ for all } 1 \le i \le N.$$

Narain (1951) had derived this estimator independently on intuitive grounds.
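Now,

$$V(\hat{Y}_{HT}) = V\left(\sum_{i=1}^{N}\frac{I_i Y_i}{\pi_i}\right) = \sum_{i=1}^{N}\left(\frac{Y_i}{\pi_i}\right)^2 V(I_i) + \sum\sum_{i \ne j}^{N}\left(\frac{Y_i}{\pi_i}\right)\left(\frac{Y_j}{\pi_j}\right)\text{cov}(I_i, I_j)$$

$$= \sum_{i=1}^{N}\left(\frac{Y_i}{\pi_i}\right)^2\pi_i(1 - \pi_i) + \sum\sum_{i \ne j}^{N}\left(\frac{Y_i Y_j}{\pi_i\pi_j}\right)(\pi_{ij} - \pi_i\pi_j).$$

This form of $V(\hat{Y}_{HT})$ is due to Horvitz and Thompson (1952). An unbiased variance estimator of $\hat{Y}_{HT}$ is then given by

$$\text{Est.}_{HT}V(\hat{Y}_{HT}) = \sum_{i=1}^{n}\left(\frac{y_i}{\pi_i}\right)^2(1 - \pi_i) + \sum\sum_{i \ne j}^{n}\left(\frac{y_i y_j}{\pi_i\pi_j}\right)\frac{(\pi_{ij} - \pi_i\pi_j)}{\pi_{ij}}.$$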


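The proof can be established by using the sample membership functions $I_i$. Further, for a fixed sample size design we write

$$V(\hat{Y}_{HT}) = \sum_{i=1}^{N}\sum_{j=1}^{N}\text{cov}(I_i, I_j)\left(\frac{Y_i}{\pi_i}\right)\left(\frac{Y_j}{\pi_j}\right) = \sum\sum\Delta_{ij}\left(\frac{Y_i Y_j}{\pi_i\pi_j}\right),$$

where $\Delta_{ij} = \text{cov}(I_i, I_j) = \pi_{ij} - \pi_i\pi_j$ and $\Delta_{ii} = \pi_i(1 - \pi_i)$,

$$= -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\Delta_{ij}\left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2, \quad \text{using } \sum_{j=1}^{N}\Delta_{ij} = 0,$$

$$= -\sum\sum_{i<j}\Delta_{ij}\left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2 = \sum\sum_{i<j}(\pi_i\pi_j - \pi_{ij})\left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2,$$

an expression which is due to Sen (1953) and Yates and Grundy (1953). An alternative unbiased variance estimator of $V(\hat{Y}_{HT})$, when the variance is expressed in the Sen-Yates and Grundy form, is given by (the proof is simple)

$$\text{Est.}_{SYG}V(\hat{Y}_{HT}) = \sum\sum_{i<j \in s}\frac{(\pi_i\pi_j - \pi_{ij})}{\pi_{ij}}\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2.$$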
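
The Horvitz-Thompson estimator and the two variance estimators can be exercised on any design that is small enough to enumerate. The sketch below is illustrative only: it assumes draw-by-draw PPSWOR with n = 2 as the design, and all function names and data are hypothetical. The inclusion probabilities are obtained by summing p(s) over the samples containing the unit or pair.

```python
from fractions import Fraction as F
from itertools import combinations

def design_ppswor_n2(p):
    """p(s) for each unordered sample (i, j), i < j, under draw-by-draw PPSWOR, n = 2."""
    return {(i, j): p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))
            for i, j in combinations(range(len(p)), 2)}

def inclusion_probs(design, N):
    """pi_i and pi_ij as sums of p(s) over samples containing i, or both i and j."""
    pi = [sum(ps for s, ps in design.items() if i in s) for i in range(N)]
    pij = {(i, j): sum(ps for s, ps in design.items() if i in s and j in s)
           for i, j in combinations(range(N), 2)}
    return pi, pij

def ht_estimate(s, Y, pi):
    return sum(Y[i] / pi[i] for i in s)

def var_est_ht(s, Y, pi, pij):
    """Horvitz-Thompson form of the variance estimator."""
    v = sum((Y[i] / pi[i]) ** 2 * (1 - pi[i]) for i in s)
    v += 2 * sum(Y[i] * Y[j] / (pi[i] * pi[j]) * (pij[i, j] - pi[i] * pi[j]) / pij[i, j]
                 for i, j in combinations(s, 2))
    return v

def var_est_syg(s, Y, pi, pij):
    """Sen-Yates-Grundy form of the variance estimator (fixed sample size)."""
    return sum((pi[i] * pi[j] - pij[i, j]) / pij[i, j]
               * (Y[i] / pi[i] - Y[j] / pi[j]) ** 2
               for i, j in combinations(s, 2))

# Hypothetical population of N = 4 units.
Y = [3, 1, 4, 6]
p = [F(1, 10), F(1, 5), F(3, 10), F(2, 5)]
design = design_ppswor_n2(p)
pi, pij = inclusion_probs(design, len(Y))
for s in design:
    print(s, ht_estimate(s, Y, pi), var_est_ht(s, Y, pi, pij), var_est_syg(s, Y, pi, pij))
```

Averaging either variance estimator over the design reproduces the true variance, although the two can differ sample by sample.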




Remarks:

1. A set of necessary and sufficient conditions for the estimability of Y in a given design p is that $\pi_i > 0$, $1 \le i \le N$. When these conditions hold good, an unbiased estimator of Y is the Horvitz-Thompson estimator (Godambe, 1955).

2. Both the HT and SYG variance estimators require that $\pi_i > 0$ for all $i \in U$ and $\pi_{ij} > 0$ for all $i \ne j \in U$. However, for any given s, p(s) is bound to be positive, and thus $\pi_{ij} > 0$ for $i \ne j \in s$, whereas the stronger requirement $\pi_{ij} > 0$ for all $i \ne j \in U$ may not be satisfied, and hence an attempt to form variance estimators in such cases may lead to misleading results.

3. The Sen-Yates and Grundy form of the variance expression has an interesting explicit implication in the sense that $V(\hat{Y}_{HT})_{SYG} = 0$ if $\pi_i$ is proportional to $Y_i$. However, this is a utopian situation which may not always be achieved. In case an auxiliary variable or size variable x related to the survey variable y is available, one can think of a fixed sample size (n) design where $\pi_i$ is proportional to $X_i$ for all i, provided the $X_i$'s on x are known in advance, resulting in a substantial reduction in variance. Fixed sample size (n) designs where $\pi_i = nP_i$ $(P_i = X_i/X)$ are called IPPS (inclusion probability proportional to size) or πPS designs. Therefore, if an auxiliary variable x is available that is approximately proportional to y, the HT estimator based on πPS designs achieves a high level of efficiency. Of course, non-negativity of the variance estimator is another aspect which should not be lost sight of.

It may be noted here that if the y-values are not related to the inclusion probabilities, the variance of the Horvitz-Thompson estimator can be large. Basu (1971) has given an interesting 'Circus Elephant example' where the HT estimator performs badly because the inclusion probabilities are poorly correlated with the y-values. This calls for the recommendation of a generalized ratio estimator $\hat{\bar{Y}}_H = (\sum y_i/\pi_i)/(\sum 1/\pi_i)$ as an estimate of the population mean $\bar{Y}$ (Hajek, 1971).

4. Depending on the sampling design and the configuration of $(Y_1, Y_2, \ldots, Y_N)$, the HT and SYG variance estimators may take negative values for some samples. The SYG variance estimator is non-negative if $\Delta_{ij} \le 0$ for all $i \ne j$, that is, $\pi_{ij} \le \pi_i\pi_j$ for all $i \ne j$ (a sufficient condition). Biyani (1978) has proved that a necessary condition for the HT variance estimator to be non-negative is $\text{Est.}_{HT}V(\hat{Y}_{HT}) = \text{Est.}_{SYG}V(\hat{Y}_{HT})$. For given data it may be possible to get a negative value of $\text{Est.}_{SYG}V(\hat{Y}_{HT})$ and a positive value of $\text{Est.}_{HT}V(\hat{Y}_{HT})$.

Based on empirical and limited theoretical evidence, the SYG variance estimator is generally considered to be superior to the HT variance estimator because the SYG variance estimator takes fewer negative estimates and has smaller sampling variance (Cumberland and Royall, 1981 and Rao and Singh, 1973). When the $Y_i/\pi_i$'s are equal for all population units, $V(\hat{Y}_{HT}) = 0$ and $\text{Est.}_{SYG}V(\hat{Y}_{HT})$ is zero, but $\text{Est.}_{HT}V(\hat{Y}_{HT})$ does not equal zero and thus, being an unbiased estimator, it must be capable of assuming negative values. The efficiency of the HT variance estimator is expected to be low when the variation between the $y_i/\pi_i$'s is small relative to their magnitude and there is considerable variability among the $\pi_i$'s (Yates and Grundy, 1953). For populations for which $y_i$ is nearly proportional to $\pi_i$, that is the case where πPS sampling is very efficient, it is conjectured that $\text{Est.}_{SYG}V(\hat{Y}_{HT})$ may have smaller variance.

To get better insight into the structure of the variance estimators let us write (Stehman and Overton, 1994)

$$\text{Est.}_{HT}V(\hat{Y}_{HT}) = \sum_{i=1}^{n} y_i'^2(1 - \pi_i) - \sum\sum_{i \ne j} W_{ij}\,y_i' y_j',$$

where $y_i' = y_i/\pi_i$ and $W_{ij} = (\pi_i\pi_j - \pi_{ij})/\pi_{ij}$, and

$$\text{Est.}_{SYG}V(\hat{Y}_{HT}) = \sum\sum_{i \ne j} W_{ij}\,y_i'^2 - \sum\sum_{i \ne j} W_{ij}\,y_i' y_j' = \sum_{i=1}^{n} y_i'^2\left(\sum_{j \ne i} W_{ij}\right) - \sum\sum_{i \ne j} W_{ij}\,y_i' y_j'.$$

Comparison of the variance estimators shows that the second terms in the two estimators are the same. The first term in the HT variance estimator has the coefficient $(1 - \pi_i)$ for $y_i'^2$, whereas in the SYG variance estimator the coefficient of $y_i'^2$ is a random variable having conditional expectation equal to $(1 - \pi_i)$. Further,

$$\text{Est.}_{HT}V(\hat{Y}_{HT}) = \sum_{i=1}^{n} y_i'\left[(1 - \pi_i)\,y_i' - \sum_{j \ne i} W_{ij}\,y_j'\right],$$

$$\text{Est.}_{SYG}V(\hat{Y}_{HT}) = \sum_{i=1}^{n} y_i'\sum_{j \ne i} W_{ij}\,(y_i' - y_j').$$

Then, Stehman and Overton (1994) conclude: “If $W_{ij}$ is relatively homogeneous, $\text{Est.}_{HT}V(\hat{Y}_{HT})$ and $\text{Est.}_{SYG}V(\hat{Y}_{HT})$ are virtually identical. When $W_{ij}$ is variable, they will differ most when the variability of the $y_i'$'s is small. The latter condition is satisfied when y and x are highly correlated and the regression line passes near or through the origin, which is the condition when sampling with varying probabilities is highly efficient compared to simple random sampling. If the variability in $y'$ is moderate to large, the advantage of $\text{Est.}_{SYG}V(\hat{Y}_{HT})$ is expected to be small”. These conjectures have been confirmed by empirical studies performed by Stehman and Overton.

5. As regards the stability of the HT and SYG variance estimators, it may be pointed out that if the estimates for some samples take inordinately large values because of very small values of $\pi_{ij}$, leaving scope for low values of the estimates with other samples (as the variance estimators are unbiased), it is natural to expect large sampling fluctuations for the estimators, leading to poor stability unless small values of $\pi_{ij}$ are not permitted to occur. Looking at the structure of the formulae for the HT and SYG variance estimators, one may imagine that the estimators possess good stability if $\pi_{ij}/\pi_i\pi_j$ varies within narrow margins and does not drop too far below unity (Chaudhuri and Vos, 1988). Keeping in view the difficulties encountered in using the Horvitz-Thompson estimator, Hanurav (1967) recommended that a sampling scheme for use with the HT estimator should satisfy the following conditions:

a) sample s of size n is fixed and for all s, p(s) > 0.

b) $\pi_i = np_i = nX_i/X$.

c) $\pi_{ij} > 0$.

d) $\pi_{ij} \le \pi_i\pi_j$.

e) $\pi_{ij}/(\pi_i\pi_j) > \beta$, with $\beta > 0$ not close to 0 $(i \ne j)$.

f) easy computability of $\pi_{ij}$ for all $(i \ne j)$.

6. Raj (1968) has shown that a sufficient condition for the without replacement estimator $\sum_{i=1}^{n} y_i/\pi_i$ to have smaller variance than the with replacement estimator $\sum_{i=1}^{n} y_i/(np_i)$ with $\pi_i = np_i$, independently of the y's, is that $\pi_{ij} \ge [(n-1)/n]\,\pi_i\pi_j$ for all $i, j \in U$. However, a necessary condition obtained by Narain (1951) for the superiority of the WOR estimator over the WR estimator with $\pi_i = np_i$, independently of the y's, is $\pi_{ij} \le 2[(n-1)/n]\,\pi_i\pi_j$. Write

$$V(\hat{Y}_{PPSWR}) = \sum_{i=1}^{N}\frac{d_i^2}{nP_i}, \quad \text{where } d_i = Y_i - P_i Y, \ \sum_{i=1}^{N} d_i = 0.$$

When $\pi_i = nP_i$, write

$$V(\hat{Y}_{HT})_{PPSWOR} = \sum\sum_{i<j}(\pi_i\pi_j - \pi_{ij})\left(\frac{d_i}{\pi_i} - \frac{d_j}{\pi_j}\right)^2$$

$$= \sum_{i=1}^{N}\left(\frac{d_i^2}{\pi_i}\right)(1 - \pi_i) + \sum\sum_{i \ne j}\left(\frac{d_i d_j}{\pi_i\pi_j}\right)(\pi_{ij} - \pi_i\pi_j)$$

$$= \sum_{i=1}^{N}\frac{d_i^2}{\pi_i} + \sum\sum_{i \ne j}\left(\frac{\pi_{ij}}{\pi_i\pi_j}\right)d_i d_j - \left(\sum_{i=1}^{N} d_i\right)^2.$$

Hence,

$$V(\hat{Y}_{HT})_{PPSWOR} - V(\hat{Y}_{PPSWR}) = \sum\sum_{i \ne j} d_i d_j\,\frac{\pi_{ij}}{\pi_i\pi_j}.$$

Thus, the Horvitz-Thompson estimator based on sampling without replacement with inclusion probability proportional to size has a variance smaller than one based on the corresponding sampling with replacement if

$$\sum\sum_{i \ne j} d_i d_j\,\frac{\pi_{ij}}{\pi_i\pi_j} \le 0$$

for all real numbers $d_1, d_2, \ldots, d_N$ subject to the restriction $\sum_{i=1}^{N} d_i = 0$ (Gabler, 1981).
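
The variance identity behind Gabler's condition can be illustrated numerically. The sketch below is an illustration under stated assumptions: it takes n = 2 and uses, as an example of a πPS design, the joint inclusion probabilities of the Brewer/Durbin scheme described in Section 4.3.5, $\pi_{ij} = 2\lambda p_i p_j[1/(1 - 2p_i) + 1/(1 - 2p_j)]$ with $\lambda^{-1} = 1 + \sum p_k/(1 - 2p_k)$. Function names and data are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

def durbin_pij(p):
    """Joint inclusion probabilities of the Brewer/Durbin scheme, n = 2 (max p_i < 1/2)."""
    lam = 1 / (1 + sum(q / (1 - 2 * q) for q in p))
    return {(i, j): 2 * lam * p[i] * p[j] * (1 / (1 - 2 * p[i]) + 1 / (1 - 2 * p[j]))
            for i, j in combinations(range(len(p)), 2)}

def variance_gap(Y, p):
    """Return (V_HT under the piPS design, V_PPSWR, cross-product term) for n = 2."""
    n = 2
    pi = [n * q for q in p]
    pij = durbin_pij(p)
    total = sum(Y)
    v_wr = sum(q * (y / q - total) ** 2 for y, q in zip(Y, p)) / n
    v_ht = sum((pi[i] * pi[j] - pij[i, j]) * (Y[i] / pi[i] - Y[j] / pi[j]) ** 2
               for (i, j) in pij)
    d = [y - q * total for y, q in zip(Y, p)]
    # sum over i != j equals twice the sum over i < j
    cross = 2 * sum(d[i] * d[j] * pij[i, j] / (pi[i] * pi[j]) for (i, j) in pij)
    return v_ht, v_wr, cross

v_ht, v_wr, cross = variance_gap([3, 1, 4, 6], [F(1, 10), F(1, 5), F(3, 10), F(2, 5)])
print(v_ht - v_wr == cross, cross < 0)
```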

7. $\hat{Y}_{HT}$ is the only unbiased estimator of the class in which the same weight is attached to a particular population unit whenever it is selected (Horvitz and Thompson, 1952).

8. $\hat{Y}_{HT}$ has the ratio estimator property in the sense that when the $Y_i$'s are all exactly proportional to the corresponding $\pi_i$'s, $V(\hat{Y}_{HT})$ is zero.

9. For a fixed sample design with $\pi_i > 0$, the HT estimator of population total or mean is admissible in the class of unbiased estimators (Godambe and Joshi, 1965).

10. Under the model $Y_i = \beta\pi_i + e_i$, where $E_m(e_i) = 0$, $E_m(e_i e_j) = 0$ for $i \ne j$ and $V_m(e_i) = \sigma^2\pi_i^{2\gamma}$ $(\tfrac{1}{2} \le \gamma \le 1)$, the expected variance of the Horvitz-Thompson estimator achieves a lower bound of the expected variance for any design unbiased estimator (Godambe and Joshi, 1965).

11. Every non-negative homogeneous quadratic unbiased variance estimator of the HT estimator must necessarily be of the class of estimators given by

$$\text{Est.}V(\hat{Y}_{HT}) = \sum\sum_{i<j \in s} b_{ij}(s)\,(z_i - z_j)^2,$$

where $z_i = y_i/\pi_i$ and (i) $\sum_{s \ni i,j} b_{ij}(s)\,p(s) = \pi_i\pi_j - \pi_{ij}$, $1 \le i \ne j \le N$.

The estimator becomes non-negative with

$$b_{ii}(s) = -\sum_{j (\ne i)} b_{ij}(s), \quad 1 \le i \le N,$$

if for every sample s with p(s) > 0, $B(s) = [b_{ij}(s)]$ is an n × n negative semi-definite matrix.

$\text{Est.}_{SYG}V(\hat{Y}_{HT})$ is one such estimator and, for n = 2, is the only non-negative unbiased variance estimator (Vijayan, 1975).

12. The HT estimator can be used with a PPSWR design depending only on distinct units, where $\pi_i = 1 - (1 - p_i)^n$.


4.3.5 Sampling Methods with Unequal Probability without 
Replacement 

The general requirement for the Horvitz-Thompson estimator of the population total/mean to possess the πPS property, and the restrictive conditions imposed on the inclusion probabilities over and above the logical necessity of non-negativity of the variance estimator, have led to a variety of sampling procedures. These procedures may be classified following Hanif and Brewer (1980) as (a) Draw by Draw method (b) Systematic method (c) Rejective method (d) Grouping method (e) Whole sample method (f) Miscellaneous method. Some of these sampling schemes are described below.

Draw by Draw Methods 

a) Method of Horvitz-Thompson (1952): A sample of size 2 is drawn with replacement. The probability of selecting the ith unit $U_i$ only once in 2 draws is $2p_i(1 - p_i)$, where $p_i$ is the probability of selecting the ith unit $(i = 1, 2, \ldots, N)$ in any draw. Assuming that the probability of the ith unit being duplicated in a sample of size 2 is negligible, we have for the sampling scheme to have the πPS property $2p_i(1 - p_i) = 2Z_i$, where $Z_i = X_i/X$, $X_i$ being the size of the ith unit and X the population total of x. The solution for $p_i$, taking the smaller of the two roots and then norming, is given by

$$p_i = \left(1 - \sqrt{1 - 4Z_i}\right)\Big/\sum_{i=1}^{N}\left(1 - \sqrt{1 - 4Z_i}\right), \quad \text{assuming } Z_i < \tfrac{1}{4}.$$

If these $p_i$'s are taken as selection probabilities on the first draw and the selection probabilities on the second draw are made proportional to them for the units not yet drawn, the inclusion probabilities are

$$\pi_i = p_i + \sum_{j \ne i} p_j\,[p_i/(1 - p_j)]$$

and

$$\pi_{ij} = p_i p_j\,[1/(1 - p_i) + 1/(1 - p_j)].$$

Horvitz and Thompson viewed that for N > 10, the πPS property should hold approximately.
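
How closely the resulting $\pi_i$ track $2Z_i$ can be inspected numerically. This is a minimal sketch using floating-point arithmetic and a hypothetical population of N = 12 sizes; the function names and the size values are assumptions made for illustration.

```python
import math

def ht_working_probs(Z):
    """Normed smaller roots of p (1 - p) = Z; requires every Z_i < 1/4."""
    roots = [1 - math.sqrt(1 - 4 * z) for z in Z]
    total = sum(roots)
    return [r / total for r in roots]

def pi_draw_by_draw_n2(p):
    """pi_i = p_i * (1 + sum_j p_j / (1 - p_j) - p_i / (1 - p_i)) for n = 2."""
    S = sum(q / (1 - q) for q in p)
    return [q * (1 + S - q / (1 - q)) for q in p]

# Hypothetical sizes for N = 12 units.
X = [3, 5, 8, 10, 12, 6, 7, 9, 11, 4, 13, 12]
Z = [x / sum(X) for x in X]
p = ht_working_probs(Z)
pi = pi_draw_by_draw_n2(p)
for z, q in zip(Z, pi):
    print(round(2 * z, 4), round(q, 4))   # target 2 Z_i next to the achieved pi_i
```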

b) Method of Yates and Grundy (1953): Two units are drawn with probability proportional to revised sizes $p_i$, with $\sum_{i=1}^{N} p_i = 1$, and without replacement. Then

$$\pi_i = p_i + \sum_{j \ne i} p_j\left(\frac{p_i}{1 - p_j}\right) = p_i\left(1 + S - \frac{p_i}{1 - p_i}\right), \quad \text{where } S = \sum_{j=1}^{N}[p_j/(1 - p_j)].$$

Equating $\pi_i$ to $2Z_i$ to safeguard the πPS property, we have

$$p_i = 2Z_i\Big/\left(1 + S - \frac{p_i}{1 - p_i}\right).$$

As a first approximation, take $Z_i$ for $p_i$ on the right hand side and thus we get

$$p_i^{(1)} = 2Z_i\Big/\left(1 + \sum_{j=1}^{N}\frac{Z_j}{1 - Z_j} - \frac{Z_i}{1 - Z_i}\right).$$

Usually, the first approximation may seem to be adequate and we may use $p_i^{(1)}$ after norming it, so that $\sum_{i=1}^{N} p_i^{(1)} = 1$.

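c) Brewer's (1963b) Method: The first unit is drawn with revised probabilities $p_i$ and without replacement (WOR). The second unit is drawn with PPS from the remaining units. Thus,

$$\pi_i = p_i + \sum_{j \neq i} p_j \frac{Z_i}{1 - Z_j} = p_i + Z_i\left[T - \frac{p_i}{1 - Z_i}\right], \quad \text{where } T = \sum_{j=1}^{N} \frac{p_j}{1 - Z_j}.$$

The πPS property holds if

$$p_i + Z_i\left[T - \frac{p_i}{1 - Z_i}\right] = 2Z_i.$$

Thus, the revised probabilities are (solving for $p_i$)

$$p_i = \frac{Z_i(1 - Z_i)}{1 - 2Z_i}\,(2 - T).$$

That is, $p_i$ varies as $Z_i(1 - Z_i)/(1 - 2Z_i)$. Using $\sum_{i=1}^{N} p_i = 1$, the revised probabilities $p_i$ can be written as

$$p_i = \frac{Z_i(1 - Z_i)}{1 - 2Z_i} \Big/ \sum_{j=1}^{N} \frac{Z_j(1 - Z_j)}{1 - 2Z_j} = \frac{2Z_i(1 - Z_i)}{1 - 2Z_i} \Big/ \left[1 + \sum_{j=1}^{N} \frac{Z_j}{1 - 2Z_j}\right],$$

since

$$\frac{Z_i(1 - Z_i)}{1 - 2Z_i} = \frac{2Z_i - 2Z_i^2}{2(1 - 2Z_i)} = \frac{Z_i(1 - 2Z_i) + Z_i}{2(1 - 2Z_i)} = \frac{1}{2}\left[Z_i + \frac{Z_i}{1 - 2Z_i}\right],$$

from which

$$\sum_{i=1}^{N} \frac{Z_i(1 - Z_i)}{1 - 2Z_i} = \frac{1}{2}\left[1 + \sum_{i=1}^{N} \frac{Z_i}{1 - 2Z_i}\right], \quad \text{using } \sum Z_i = 1.$$

Further,

$$\pi_{ij} = \frac{p_i Z_j}{1 - Z_i} + \frac{p_j Z_i}{1 - Z_j} = 2Z_iZ_j\left[\frac{1}{1 - 2Z_i} + \frac{1}{1 - 2Z_j}\right] \Big/ \left\{1 + \sum_{k=1}^{N} \frac{Z_k}{1 - 2Z_k}\right\}.$$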

It can be shown that the SYG variance estimator is non-negative for every sample. For n > 2, the reader may refer to Brewer (1975) for details.

It can be shown that the variance of the Horvitz-Thompson estimator of the finite population total based on Brewer's πPSWOR procedure is smaller than the variance of the Hansen-Hurwitz estimator based on a PPSWR sample involving the same number of draws.
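The πPS property of Brewer's scheme can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `brewer_n2` and the example sizes are illustrative assumptions. It computes the revised first-draw probabilities, rebuilds $\pi_i$ from the draw-by-draw definition, and evaluates the closed form for $\pi_{ij}$.

```python
from itertools import combinations


def brewer_n2(Z):
    """Revised draw probabilities and inclusion probabilities for n = 2.

    Z: normed sizes assumed to sum to 1, each strictly below 1/2.
    (Hypothetical helper written for illustration only.)
    """
    if max(Z) >= 0.5:
        raise ValueError("the scheme needs every Z_i < 1/2")
    N = len(Z)
    D = 1.0 + sum(z / (1 - 2 * z) for z in Z)
    # Revised first-draw probabilities p_i, proportional to Z(1-Z)/(1-2Z).
    p = [2 * z * (1 - z) / ((1 - 2 * z) * D) for z in Z]
    # First-order inclusion probabilities from the draw-by-draw definition:
    # unit i is drawn first, or unit j is drawn first and i second.
    pi = [p[i] + sum(p[j] * Z[i] / (1 - Z[j]) for j in range(N) if j != i)
          for i in range(N)]
    # Joint inclusion probabilities from the closed-form expression.
    pij = {(i, j): 2 * Z[i] * Z[j]
           * (1 / (1 - 2 * Z[i]) + 1 / (1 - 2 * Z[j])) / D
           for i, j in combinations(range(N), 2)}
    return p, pi, pij


Z = [0.1, 0.2, 0.3, 0.4]
p, pi, pij = brewer_n2(Z)
print([round(v, 6) for v in pi])  # should reproduce 2 * Z_i
```

For n = 2 the joint probabilities of pairs containing unit i must add up to $\pi_i$, which gives a second consistency check on the formulas.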

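d) Durbin's (1967) Method: For n = 2, choose the first unit with probability $p_i$ and the second with probability proportional to $p_j\left[1/(1 - 2p_i) + 1/(1 - 2p_j)\right]$, $j \neq i$. Thus, the conditional probability of choosing the jth unit second given that the ith is chosen first is

$$p_{j|i} = \lambda\, p_j \left[\frac{1}{1 - 2p_i} + \frac{1}{1 - 2p_j}\right].$$

Since $\sum_{j \neq i} p_{j|i} = 1$, we have

$$\left[1 + \sum_{k=1}^{N} \frac{p_k}{1 - 2p_k}\right]^{-1} = \lambda, \text{ say.}$$

Therefore,

$$p_i\, p_{j|i} = \lambda\, p_i p_j \left[\frac{1}{1 - 2p_i} + \frac{1}{1 - 2p_j}\right] = p_j\, p_{i|j}.$$

Hence,

$$\pi_i = p_i + \sum_{j \neq i} p_j\, p_{i|j} = p_i + \sum_{j \neq i} p_i\, p_{j|i} = p_i + p_i \sum_{j \neq i} p_{j|i} = 2p_i, \quad \text{as } \sum_{j \neq i} p_{j|i} = 1,$$

$$\pi_{ij} = p_i\, p_{j|i} + p_j\, p_{i|j} = 2\lambda\, p_i p_j \left[\frac{1}{1 - 2p_i} + \frac{1}{1 - 2p_j}\right], \quad (j \neq i = 1, 2, \ldots, N).$$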

The condition for selection with probability proportional to $p_i$ to be possible is that max $p_i$ < 1/2. This follows from the fact that the probability of inclusion of the ith unit is $2p_i$, which cannot be greater than one. If max $p_i$ = 1/2, its $\pi_i$ = 1, that is, the largest unit is always included.

For sample size n = 2, Brewer's method and Durbin's method belong to the same equivalence class. Sampford (1967) has extended Durbin's scheme to samples of size greater than 2.

Sutradhar (1984) showed that for n = 2, the HTE of population total under Durbin's scheme has smaller variance than the estimator under a PPSWR scheme. The proof runs as follows:


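$$V(\hat{Y}_{PPSWR}) = \frac{1}{2} \sum_{i=1}^{N} \frac{d_i^2}{p_i},$$

where $d_i = Y_i - p_iY$. Further,

$$V(\hat{Y}_{HT})_D = \frac{1}{2} \sum\sum_{i \neq j} (\pi_i\pi_j - \pi_{ij}) \left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2 = \frac{1}{2} \sum_{i=1}^{N} \frac{d_i^2}{p_i} - \lambda \sum_{i=1}^{N} \frac{d_i^2}{1 - 2p_i},$$

where $\lambda = \left[1 + \sum_{k} p_k/(1 - 2p_k)\right]^{-1}$ is the norming constant of Durbin's scheme, and which is less than $V(\hat{Y}_{PPSWR})$.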
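The comparison can be illustrated with a small numerical sketch. It is not taken from the text: the function name `durbin_n2` and the data are hypothetical. It evaluates the PPSWR variance, the Sen-Yates-Grundy form of the HT variance under Durbin's scheme, and the closed form $\tfrac12\sum d_i^2/p_i - \lambda\sum d_i^2/(1-2p_i)$, so the three can be compared.

```python
from itertools import combinations


def durbin_n2(p, Y):
    """Variance comparison for Durbin's n = 2 scheme (illustrative helper).

    p: draw probabilities summing to 1 with max(p) < 1/2; Y: unit values.
    """
    N = len(p)
    lam = 1.0 / (1.0 + sum(pk / (1 - 2 * pk) for pk in p))
    # Joint inclusion probabilities pi_ij = 2 * lam * p_i p_j [ ... ].
    pij = {(i, j): 2 * lam * p[i] * p[j]
           * (1 / (1 - 2 * p[i]) + 1 / (1 - 2 * p[j]))
           for i, j in combinations(range(N), 2)}
    total = sum(Y)
    d = [Y[i] - p[i] * total for i in range(N)]
    # PPSWR variance for two draws.
    v_ppswr = 0.5 * sum(d[i] ** 2 / p[i] for i in range(N))
    # Sen-Yates-Grundy form with pi_i = 2 p_i, summed over pairs i < j.
    v_ht = sum((4 * p[i] * p[j] - pij[i, j])
               * (Y[i] / (2 * p[i]) - Y[j] / (2 * p[j])) ** 2
               for i, j in pij)
    # Closed form: PPSWR variance minus a non-negative correction.
    v_closed = v_ppswr - lam * sum(d[i] ** 2 / (1 - 2 * p[i]) for i in range(N))
    return lam, pij, v_ppswr, v_ht, v_closed


lam, pij, v_ppswr, v_ht, v_closed = durbin_n2(
    [0.1, 0.2, 0.3, 0.4], [3.0, 5.0, 10.0, 11.0])
print(v_ppswr, v_ht, v_closed)
```

The Sen-Yates-Grundy value and the closed form should agree up to rounding, and both should fall below the PPSWR variance whenever the $d_i$ are not all zero.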

e) Midzuno's Procedure: Consider fixed sample size n. Select the first unit with unequal probability of selection $p_i$ (i = 1, 2, ..., N), $\sum p_i = 1$, and the remaining (n - 1) units with equal probability without replacement. In case the first unit is selected PPS to size variable x, $p_i = X_i/X = Z_i$. Then

$$\pi_i = p_i + \frac{(1 - p_i)(n - 1)}{N - 1} = \frac{N - n}{N - 1}\, p_i + \frac{n - 1}{N - 1},$$

$$\pi_{ij} = p_i \frac{n - 1}{N - 1} + p_j \frac{n - 1}{N - 1} + (1 - p_i - p_j) \frac{(n - 1)(n - 2)}{(N - 1)(N - 2)} = \frac{n - 1}{N - 1}\left[\frac{N - n}{N - 2}(p_i + p_j) + \frac{n - 2}{N - 2}\right],$$

$$\pi_{ij \ldots r} = \frac{(n - 1)(n - 2) \cdots (n - r + 1)}{(N - 1)(N - 2) \cdots (N - r + 1)}\left[\frac{N - n}{N - r}(p_i + p_j + \cdots + p_r) + \frac{n - r}{N - r}\right],$$

$$\pi_{ij \ldots n} = \frac{(n - 1)(n - 2) \cdots 1}{(N - 1)(N - 2) \cdots (N - n + 1)} \cdot \frac{N - n}{N - n}(p_i + p_j + \cdots + p_n) = (p_i + p_j + \cdots + p_n) \Big/ \binom{N - 1}{n - 1},$$
that is, the probability of selecting a specified set of n units in the sample is 
proportional to total measure of the sizes of these units included in the sample. 
This sampling scheme outlined by Horvitz and Thompson (1952) was 
attributed to Midzuno presumably because this leads to Midzuno’s (1950, 
1952) selection procedure for selecting samples with probabilities proportional 
to the aggregate measures of size (PPAS) of units in the sample. However, as 
stated in literature, this procedure was also independently suggested by Sen 
(1952). It may be seen that for Midzuno scheme of sampling 

$$\pi_i\pi_j - \pi_{ij} = \frac{N - n}{(N - 1)^2}\left[(N - n)\, p_ip_j + \frac{(n - 1)(1 - p_i - p_j)}{N - 2}\right],$$

which is always non-negative. But unfortunately $\pi_i$ is not proportional to $p_i$ or $Z_i$. For the πPS property

$$\pi_i = \frac{N - n}{N - 1}\, p_i + \frac{n - 1}{N - 1} = nZ_i, \quad \text{where } Z_i = X_i/X,$$

solving which for $p_i$, we get a 'revised set of selection probabilities' as

$$p_i^* = nZ_i \frac{N - 1}{N - n} - \frac{n - 1}{N - n}, \quad (i = 1, 2, \ldots, N).$$

Thus, for the Midzuno scheme with revised probabilities, $\pi_i = nZ_i$ and

$$\pi_{ij} = \frac{n(n - 1)}{N - 2}\left[Z_i + Z_j - \frac{1}{N - 1}\right].$$

As $p_i^*$ must necessarily be positive, we require $Z_i > (n - 1)/((N - 1)n)$, which is a very serious limitation on the applicability of the scheme.
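A brute-force check of these expressions is easy for a small population. The sketch below is an illustration under stated assumptions: the helper names `midzuno_revised` and `midzuno_inclusion` and the sizes are not from the source. It enumerates every sample, uses $p(s) = \sum_{i \in s} p_i / \binom{N-1}{n-1}$, and confirms that the revised probabilities give $\pi_i = nZ_i$.

```python
from itertools import combinations
from math import comb


def midzuno_revised(Z, n):
    """Revised first-draw probabilities p*_i (hypothetical helper)."""
    N = len(Z)
    p_star = [(n * z * (N - 1) - (n - 1)) / (N - n) for z in Z]
    if min(p_star) <= 0:
        raise ValueError("need Z_i > (n - 1) / (n (N - 1)) for every unit")
    return p_star


def midzuno_inclusion(p, n):
    """First-order inclusion probabilities by enumerating all samples."""
    N = len(p)
    pi = [0.0] * N
    for s in combinations(range(N), n):
        # Probability of the unordered sample s under Midzuno's scheme.
        ps = sum(p[i] for i in s) / comb(N - 1, n - 1)
        for i in s:
            pi[i] += ps
    return pi


Z = [0.15, 0.15, 0.2, 0.2, 0.3]
p_star = midzuno_revised(Z, n=2)
print(midzuno_inclusion(p_star, n=2))  # compare with n * Z_i
```

Calling `midzuno_inclusion` with the unrevised $p_i = Z_i$ instead shows how far the plain scheme departs from the πPS property.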

Asok and Sukhatme (1978) have shown that for any sample size n > 2, the Midzuno scheme with revised probabilities $p_i^*$ leads to an HT estimator of population total with smaller variance than the customary PPSWR estimator. Further, the SYG estimate of variance under the Midzuno scheme with revised probabilities is never negative.

Prasad and Srivenkataramana (1980) have used a transformation on the study variate y to obtain an estimator of the population total more efficient than the HT estimator under the Midzuno sampling scheme. T.J. Rao (1983b) used a transformation on the auxiliary variate and empirically found it as efficient as the PS estimator.

As an extension of Midzuno's sampling system, J.N.K. Rao (1961) has suggested another sampling system where the first unit is selected with pps ($p_i$), the second unit with pps of the remaining units, and the remaining (n - 2) units in the sample are selected with equal probabilities and without replacement. For this sampling system, the SYG estimator of variance is shown to be non-negative.


f) Hanurav's (1967) Method: Order the population elements $U_1, U_2, \ldots, U_N$ with increasing size as $X_1 \le X_2 \le \ldots \le X_N$, so that normed sizes are $Z_1 \le Z_2 \le \ldots \le Z_N$.

Hanurav gave the following method, taking $Z_{N-1} < Z_N$.

Define

$$\delta = \frac{2(1 - Z_N)(Z_N - Z_{N-1})}{1 - Z_N - Z_{N-1}}, \quad 0 < \delta < 1.$$

Now, conduct a Bernoulli trial with $\delta$ as the probability of success. If the trial results in a success, select one of the units $U_1, U_2, \ldots, U_{N-1}$ with probability proportional to $Z_1, Z_2, \ldots, Z_{N-1}$ and if $U_j$ is selected, then $U_N$ and $U_j$ are regarded as the sample of size 2. In case of failure replace $Z_i$ by

$$Z_i^* = \frac{Z_i}{1 - Z_N + Z_{N-1}}, \quad i = 1, \ldots, N - 1; \quad Z_N^* = Z_{N-1}^*; \quad \sum_{i=1}^{N} Z_i^* = 1.$$

This depends on the assumption of equality of the two largest units. Now, select two units with replacement with revised probabilities $P_i^* = Z_i^*$. If the sample consists of two distinct units accept it, otherwise select two units with replacement with probabilities proportional to $P_i^{*2}$. If two distinct units are not obtained at the second trial, repeat the process until the kth trial, when two distinct units are selected with probability proportional to $P_i^{*t}$, $t = 2^{k-1}$. The process is seen to terminate with probability one. Hanurav has also given an extension of his procedure for n > 2.

Asok and Sukhatme (1976) have shown, using the Hartley-Rao asymptotic approach, that the HT estimator under Hanurav's scheme has a relatively larger variance than the HT estimator under most of the πPS schemes.


g) Singh's (1978) Method: Draw two units with PPSWR. If the same unit i is drawn twice, select unit j from the remainder of the population with conditional probabilities

$$p_{j|i} = \lambda Z_j^2 \left[\frac{1}{S - 2Z_i^2} + \frac{1}{S - 2Z_j^2}\right],$$

where $S = \sum Z_i^2$ and $\lambda$ is determined by the relation $\sum_{j \neq i} p_{j|i} = 1$, and is given by $\lambda = \left[1 + \sum Z_j^2/(S - 2Z_j^2)\right]^{-1}$. Then

$$\pi_i = Z_i(1 - Z_i) + (1 - Z_i)Z_i + Z_i^2 + \sum_{j \neq i} Z_j^2\, \lambda Z_i^2 \left[\frac{1}{S - 2Z_i^2} + \frac{1}{S - 2Z_j^2}\right] = 2Z_i,$$

$$\pi_{ij} = 2Z_iZ_j\left[1 + \lambda Z_iZ_j\left(\frac{1}{S - 2Z_i^2} + \frac{1}{S - 2Z_j^2}\right)\right].$$

The condition of applicability of this method is found from the condition $p_{j|i} \ge 0$ for all i and j ($i \neq j$), from which it follows that $Z_N^2 < \frac{1}{2}\sum_{i=1}^{N} Z_i^2$, assuming $Z_N$ to be the largest size. It may also be seen that $\pi_{ij} \le \pi_i\pi_j$ and $\pi_{ij}/(\pi_i\pi_j) > 1/2$.

Kumar, Srivastava and Agarwal (1986) have given the generalization of 
Singh’s (1978) method. 

h) Method of Deshpande and Prabhu-Ajgaonkar (1982): Draw the first unit with PPSWR. Draw the second unit with probabilities

$$P_i = \frac{Z_i(1 - Z_i)}{(1 - 2Z_i)A},$$

where A is the norming constant. If the same unit is drawn twice, then select one more unit from the remaining N - 1 units of the population with probability $p_{j|i} = Z_j/(1 - Z_i)$, $j \neq i$. Then

$$\pi_i = Z_i + \sum_{j \neq i} Z_jP_i + \sum_{j \neq i} Z_jP_j \frac{Z_i}{1 - Z_j} = 2Z_i \text{ (on simplification)}, \quad i = 1, \ldots, N,$$

and

$$\pi_{ij} = \frac{Z_iZ_j}{A}\left(\frac{1}{1 - 2Z_i} + \frac{1}{1 - 2Z_j}\right), \quad i \neq j = 1, \ldots, N.$$

The authors, after some algebraic manipulations, have shown that $\pi_i\pi_j > \pi_{ij}$.

Systematic Method 

This procedure was originally proposed by Madow (1949) and was later 
modified by Goodman and Kish (1950). The method is described as follows: 
Arrange the units at random and form cumulative totals of sizes as

$$T_i = \sum_{j=1}^{i} X_j, \quad (i = 1, \ldots, N),$$

where $X_i$ is the size of the ith unit. To select a systematic PPS sample of size n, a random number R is drawn from 1 to K, where $K = T_N/n$ is the 'sampling interval'. Then the units corresponding to the numbers R + jK, j = 0, 1, ..., n - 1, constitute the sample. In this procedure the ith unit is included in the sample if $T_{i-1} < R + jK \le T_i$ for some j = 0, 1, ..., n - 1. Since the selection of a random number between 1 and K determines the sample and $X_i$ of the numbers are favourable for inclusion of the ith unit in the sample, naturally the inclusion probability is $\pi_i = X_i/K = nX_i/X = np_i$.

If $T_N/n$ is not an integer, the procedure can still be applied by taking K as the nearest integer. In such a situation the actual sample size differs from sample to sample and from the specified sample size by at most one unit. This difficulty can be resolved by selecting the sample in a circular fashion, that is, by choosing the random start from 1 to $T_N = X$ instead of from 1 to K. Such PPS systematic sampling may be termed PPS circular systematic sampling.
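The selection rule for the integral-interval case can be sketched in a few lines. This is a hedged illustration only: the function name `systematic_pps` and the sizes are assumptions, sizes are taken to be positive integers with $T_N/n$ an integer, and no size exceeds K (otherwise a unit could be hit more than once).

```python
from bisect import bisect_left
from itertools import accumulate


def systematic_pps(sizes, n, start):
    """Indices selected by PPS systematic sampling for a given random start.

    sizes: positive integer sizes in the (already randomized) order;
    start: the random number R, an integer from 1 to K = T_N / n.
    """
    cum = list(accumulate(sizes))          # cumulative totals T_1, ..., T_N
    total = cum[-1]
    if total % n:
        raise ValueError("this sketch assumes T_N / n is an integer")
    K = total // n                         # sampling interval
    if not 1 <= start <= K:
        raise ValueError("start must lie between 1 and K")
    # Unit i is selected when T_{i-1} < R + jK <= T_i; bisect_left finds
    # the first cumulative total that is >= the target number.
    return [bisect_left(cum, start + j * K) for j in range(n)]


sizes = [3, 1, 4, 2, 2]                    # T_N = 12, n = 3, K = 4
for R in range(1, 5):
    print(R, systematic_pps(sizes, 3, R))
```

Counting how often each unit appears over the K possible starts gives $X_i$ favourable starts for unit i, in line with $\pi_i = X_i/K$.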

The evaluation of $\pi_{ij}$ is difficult in this sampling procedure. However, the difficulties were resolved by Hartley and Rao (1962) by an asymptotic theory.

By assuming $p_i$ to be of $O(N^{-1})$ and n to be small relative to N, Hartley and Rao have given an expression for $V(\hat{Y}_{HT})$ to $O(N^{-1})$ in a familiar form like

$$V(\hat{Y}_{HT}) = \frac{1}{n} \sum_{i=1}^{N} p_i\left[1 - (n - 1)p_i\right]\left(\frac{Y_i}{p_i} - Y\right)^2,$$

which is clearly seen to be less than the variance of the usual estimator in PPS sampling with replacement.


Rejective Methods 

This term was introduced by Hajek (1964), although the method was discussed earlier by Durbin (1953), Yates and Grundy (1953) and others. The idea behind the procedure is that a sample is chosen with prescribed selection probabilities by drawing units one by one with replacement; once a unit is repeated, the entire set of units selected so far is rejected and the whole procedure starts again.


a) Method of Durbin-Yates and Grundy (1953): For n = 2, select a pair of units with replacement with probability proportional to size, rejecting those samples which contain the same unit twice.

The probability of selecting unit i twice is $p_i^2$ and that of selecting units i and j is $2p_ip_j$. Thus the total probability of selecting unit i after rejections is

$$\pi_i = \sum_{j \neq i} 2p_ip_j \Big/ \left(1 - \sum_{j=1}^{N} p_j^2\right) = 2p_i(1 - p_i) \Big/ \left(1 - \sum_{j=1}^{N} p_j^2\right).$$

For n > 2, the formulae for inclusion probabilities get complicated very rapidly.

b) Sampford's (1967) Procedure: Assume $np_i < 1$ for all i = 1, 2, ..., N, because if $np_i = 1$, the ith unit is certain to be included in the sample and the problem reduces to selecting (n - 1) units only. Let

$$\lambda_i = \frac{p_i}{1 - np_i}.$$

Further let s(m) denote a set of m different units $i_1, i_2, \ldots, i_m$ and define $L_m$ such that

$$L_0 = 1$$

and

$$L_m = \sum_{s(m)} \lambda_{i_1}\lambda_{i_2} \cdots \lambda_{i_m} \quad (1 \le m \le N),$$

where the summation is over all possible sets of m units drawn from the population.

Sampford's procedure consists in selecting a sample s(n) consisting of units $i_1, i_2, \ldots, i_n$ with probability

$$P(s(n)) = nK_n \lambda_{i_1}\lambda_{i_2} \cdots \lambda_{i_n}\left(1 - \sum_{u=1}^{n} p_{i_u}\right),$$

where

$$K_n = \left(\sum_{t=1}^{n} \frac{t\,L_{n-t}}{n^t}\right)^{-1}.$$

This needs evaluation of probabilities for the set of all possible $\binom{N}{n}$ samples and then drawing of one sample from this set with the required probabilities, which is not a very practicable proposition. For n = 2,

$$P(s(2)) = P(i, j) = K_2\, p_ip_j\left[(1 - 2p_i)^{-1} + (1 - 2p_j)^{-1}\right],$$

which is equivalent to Durbin's formula for n = 2.

Sampford has suggested two alternative methods to select a without replacement sample of size n, achieving the P(s(n)) stated earlier.

First method: Units may be selected without replacement with probabilities recalculated at each drawing as per the prescribed rule of Sampford (1967). This method is a generalization of a method due to Brewer (1963b).

Second method: The calculations in the first method may be rather tedious for large populations and hence Sampford has recommended a 'Rejective Method' based on sampling with replacement, which runs as follows.

Select n units with replacement such that the first drawing is made with probability $p_i$ and subsequent units with probabilities proportional to $p_i/(1 - np_i)$. Reject any sample which does not contain n distinct units and start afresh. For this scheme $\pi_i = np_i$. Although expressions for $\pi_{ij}$ are available, the computations become quite heavy, particularly for large N or n. It has been further shown by Sampford that $\pi_i\pi_j - \pi_{ij} > 0$, resulting in a non-negative SYG variance estimator.
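The rejective method lends itself to a short sketch. The code below is illustrative only: the names `sampford_rejective` and `sampford_inclusion` and the example probabilities are assumptions rather than anything prescribed by Sampford. The first function draws one sample by the with-replacement rule; the second enumerates all samples for a small population, weighting each by $\lambda_{i_1}\cdots\lambda_{i_n}(1 - \sum p_{i_u})$, so that the inclusion probabilities can be compared with $np_i$.

```python
import random
from itertools import combinations
from math import prod


def sampford_rejective(p, n, rng=random):
    """Draw one sample of n distinct units by the rejective method."""
    N = len(p)
    lam = [q / (1 - n * q) for q in p]     # requires n * p_i < 1
    while True:
        # First draw with probabilities p_i, the rest proportional to lambda_i.
        draw = rng.choices(range(N), weights=p, k=1)
        draw += rng.choices(range(N), weights=lam, k=n - 1)
        if len(set(draw)) == n:            # accept only n distinct units
            return sorted(draw)


def sampford_inclusion(p, n):
    """Exact first-order inclusion probabilities by enumeration (small N)."""
    N = len(p)
    lam = [q / (1 - n * q) for q in p]
    weight = {s: prod(lam[i] for i in s) * (1 - sum(p[i] for i in s))
              for s in combinations(range(N), n)}
    total = sum(weight.values())
    return [sum(w for s, w in weight.items() if i in s) / total
            for i in range(N)]


p = [0.10, 0.15, 0.20, 0.25, 0.30]
print(sampford_inclusion(p, 3))            # compare with 3 * p_i
print(sampford_rejective(p, 3, random.Random(7)))
```

The enumeration is only practical for small N, which is the same computational burden the text points out for the whole-sample form of the procedure.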

Gabler (1981) has shown that Sampford’s procedure is more efficient than 
the corresponding unequal probability sampling with replacement for any n. 

Asok and Sukhatme (1976) have provided approximate expressions for $\pi_{ij}$ for Sampford's procedure and hence an approximate expression for the variance of the HT estimator. By assuming $p_i$ to be of $O(1/N)$ and n to be small relative to N, they have shown that the HT estimator for Sampford's procedure (using the Hartley and Rao approximation) has smaller variance than the customary estimator in sampling with replacement (PPSWR).

Grouping Methods 


a) Durbin's (1967) grouping method: The grouping device brings about substantial reduction in the work of sample selection and variance estimation. As per Durbin's grouping technique the units are arranged in groups such that each group contains as few units as possible, subject to the requirement that max $p_i$ within each group is less than half the total of the $p_i$'s in that group. Now select two units from the whole population with PPS with replacement. If the two units belong to different groups, these two units constitute the required sample of size 2. If both selections belong to the same group (the units need not be different), accept the first unit, reject the second unit and make a further selection from the same group by Durbin's draw-by-draw technique, replacing $p_i$ by $p_i' = p_i/\Sigma' p_i$, where $\Sigma'$ denotes the summation over the units in the group.

Thus, the probability of getting the ith unit first and the jth unit second when the units belong to different groups is $P(i, j) = p_ip_j$, and the probability of getting the ith unit first and the jth unit second when the units belong to the same group is

$$P(i, j) = \lambda'\, p_i\,(\Sigma' p_k)\, p_j'\left[(1 - 2p_i')^{-1} + (1 - 2p_j')^{-1}\right],$$

where

$$\lambda' = \left[1 + \Sigma' \frac{p_k'}{1 - 2p_k'}\right]^{-1}.$$

In both cases P(i, j) is a symmetric function of i and j, and hence the total probability of getting the ith unit in the second draw is the same as that of the ith unit coming in the first draw, that is, $p_i$. Thus

$$\pi_i = 2p_i \quad \text{and} \quad \pi_{ij} = 2P(i, j).$$

b) Rao, Hartley and Cochran (RHC) method: The RHC method is a grouping method, providing an unbiased estimator, an exact expression for the variance and its unbiased estimator. The method runs as follows:

i) Subdivide the whole population at random into n sub-populations or groups $G_1, G_2, \ldots, G_n$ of predetermined sizes $N_1, N_2, \ldots, N_n$ respectively, with $\sum_{k=1}^{n} N_k = N$.

ii) Select one unit with probability proportional to size $p_j$ from each of these groups independently to get a predetermined sample size n.

Define $T(i) = \sum_{j \in G_i} p_j$ as the total of sizes in group i.

The estimator of population total Y proposed by Rao, Hartley and Cochran (1962) is given by

$$\hat{Y}_{RHC} = \sum_{i=1}^{n} \frac{y_i}{p_i}\, T(i),$$

where $y_i$ and $p_i$ refer to the unit selected from the ith group.

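Denote $E_2$ and $V_2$ as the conditional expectation and conditional variance given a fixed sub-division of the population, and $E_1$ and $V_1$ as the expectation and variance over all possible sub-divisions.

$$E(\hat{Y}_{RHC}) = E_1\left[\sum_{i=1}^{n} T(i)\, E_2\left(\frac{y_i}{p_i}\right)\right] = E_1\left[\sum_{i=1}^{n} T(i) \sum_{j \in G_i} \frac{Y_j}{p_j} \cdot \frac{p_j}{T(i)}\right] = E_1\left[\sum_{i=1}^{n} \sum_{j \in G_i} Y_j\right] = Y.$$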


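Again,

$$V(\hat{Y}_{RHC}) = E_1V_2(\hat{Y}_{RHC}) + V_1E_2(\hat{Y}_{RHC}) = E_1V_2(\hat{Y}_{RHC}), \text{ as the second term is zero,}$$

$$= E_1\left[\sum_{i=1}^{n} T^2(i)\, V_2\left(\frac{y_i}{p_i}\right)\right],$$

where

$$T^2(i)\, V_2(y_i/p_i) = T^2(i)\left[E_2(y_i/p_i)^2 - \left(E_2(y_i/p_i)\right)^2\right] = T(i) \sum_{j \in G_i} \frac{Y_j^2}{p_j} - \Big(\sum_{j \in G_i} Y_j\Big)^2 = \sum\sum_{j < j' \in G_i} p_jp_{j'}\left(\frac{Y_j}{p_j} - \frac{Y_{j'}}{p_{j'}}\right)^2.$$

Now the probability that any pair of units (j, j') belongs to the ith group $G_i$ is $N_i(N_i - 1)/N(N - 1)$. Therefore,

$$E_1\left[T^2(i)\, V_2(y_i/p_i)\right] = \frac{N_i(N_i - 1)}{N(N - 1)} \sum\sum_{j < j'}^{N} p_jp_{j'}\left(\frac{Y_j}{p_j} - \frac{Y_{j'}}{p_{j'}}\right)^2.$$

Thus,

$$V(\hat{Y}_{RHC}) = \sum_{i=1}^{n} \frac{N_i(N_i - 1)}{N(N - 1)} \sum\sum_{j < j'}^{N} p_jp_{j'}\left(\frac{Y_j}{p_j} - \frac{Y_{j'}}{p_{j'}}\right)^2.$$

It may be observed that

$$V(\hat{Y}_{RHC}) = \frac{n\left(\sum_i N_i^2 - N\right)}{N(N - 1)}\, V(\hat{Y}_{PPSWR}).$$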


If N/n is integral, the choice of $N_i = N/n$ minimizes the multiplier, since $\sum N_i^2 \ge n(N/n)^2$. In this case $V(\hat{Y}_{RHC}) = [1 - (n - 1)/(N - 1)]\, V(\hat{Y}_{PPSWR})$ and $\hat{Y}_{RHC}$ has smaller variance than the usual PPSWR estimator.

If N = nR + k, where R is an integer and k < n, the best choice is to make k groups of size (R + 1) and the remaining (n - k) of size R. Thus,

$$V(\hat{Y}_{RHC}) = \left[1 - \frac{n - 1}{N - 1} + \frac{k(n - k)}{N(N - 1)}\right] V(\hat{Y}_{PPSWR}).$$
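The grouping, the within-group PPS draw and the variance multiplier can be sketched as follows. This is a minimal illustration, assuming normed sizes that sum to one; the function names `rhc_group_sizes`, `rhc_estimate` and `rhc_multiplier` are hypothetical and the group sizes follow the R and R + 1 rule stated above.

```python
import random


def rhc_group_sizes(N, n):
    """k groups of size R + 1 and n - k of size R, where N = nR + k."""
    R, k = divmod(N, n)
    return [R + 1] * k + [R] * (n - k)


def rhc_estimate(y, p, n, rng=random):
    """One realisation of the RHC estimator of the population total."""
    N = len(y)
    order = list(range(N))
    rng.shuffle(order)                     # random split into groups
    estimate, start = 0.0, 0
    for size in rhc_group_sizes(N, n):
        group = order[start:start + size]
        start += size
        T = sum(p[j] for j in group)       # group total of sizes, T(i)
        # One unit per group, drawn with probability p_j / T(i).
        j = rng.choices(group, weights=[p[u] for u in group], k=1)[0]
        estimate += y[j] / p[j] * T
    return estimate


def rhc_multiplier(N, n):
    """Ratio V(RHC) / V(PPSWR) = n (sum N_i^2 - N) / (N (N - 1))."""
    sizes = rhc_group_sizes(N, n)
    return n * (sum(s * s for s in sizes) - N) / (N * (N - 1))


p = [i / 55 for i in range(1, 11)]         # normed sizes, N = 10
y = [2.0 * i for i in range(1, 11)]        # y exactly proportional to p
print(rhc_estimate(y, p, 3, random.Random(1)), sum(y))
print(rhc_multiplier(10, 3))
```

When y is exactly proportional to p, every realisation of the estimator reproduces the population total, which is a convenient sanity check on an implementation.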


Now,

$$E\left[\sum_{i=1}^{n} \frac{y_i^2}{p_i^2}\, T(i) - \hat{Y}_{RHC}^2\right] = \sum_{j=1}^{N} \frac{Y_j^2}{p_j} - Y^2 - V(\hat{Y}_{RHC}) = \left[\frac{N(N - 1)}{\sum N_i^2 - N} - 1\right] V(\hat{Y}_{RHC}),$$

from which it follows

$$\text{Est. } V(\hat{Y}_{RHC}) = \frac{\sum N_i^2 - N}{N^2 - \sum N_i^2}\left[\sum_{i=1}^{n} \frac{y_i^2}{p_i^2}\, T(i) - \hat{Y}_{RHC}^2\right].$$

In spite of the simplicity of the RHC procedure, Pathak (1964) has pointed out that $\hat{Y}_{RHC}$ is not admissible in the class of unbiased estimators of Y, because it is not a function of the minimal sufficient statistic and depends on the random grouping; therefore Rao-Blackwellization can be applied to derive an improved estimator.
a) Lahiri's (1951) Method: Lahiri's method of drawing a sample with probability proportional to aggregate size (PPAS) is as follows.

For a given sample size n, determine the largest possible aggregate size, say B. Now draw a SRSWOR of size n and a random number w between 0 and B. The selected sample is retained if $\sum_{i=1}^{n} x_i$ for the sample is greater than or equal to w; otherwise the procedure is repeated. The successive drawings of SRSWOR have been simplified by an alternative whole sample method due to Nigam and Gupta (1984). The underlying principle behind their method is to number all possible samples from 1 to $\binom{N}{n}$ and then to draw one random number designating one of these ranked samples. They have also given an algorithm to determine the composition of the sample given the rank number.
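The rejection step of Lahiri's method can be sketched directly. The following is an illustrative sketch, not the authors' algorithm verbatim: the function name `lahiri_ppas` is hypothetical, B is taken as the sum of the n largest sizes, and w is drawn uniformly on (0, B), so a candidate sample with aggregate size $x_s$ is accepted with probability $x_s/B$.

```python
import random


def lahiri_ppas(sizes, n, rng=random):
    """Draw one sample with probability proportional to aggregate size."""
    N = len(sizes)
    B = sum(sorted(sizes)[-n:])            # largest possible aggregate size
    while True:
        s = rng.sample(range(N), n)        # SRSWOR of size n
        w = rng.uniform(0, B)              # random number between 0 and B
        if sum(sizes[i] for i in s) >= w:  # retain with probability x_s / B
            return sorted(s)


print(lahiri_ppas([3, 1, 4, 2, 2], 2, random.Random(0)))
```

Because every SRSWOR candidate is equally likely before the acceptance step, the accepted sample has probability proportional to its aggregate size; the expected number of attempts grows as the typical aggregate falls further below B.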

Midzuno (1950) suggested the utility of selecting a whole sample with probability proportional to a function of the sizes in the sample, of which aggregate size is a particular case. His object was to obtain an unbiased estimator of a population parameter $\theta$. Let t be an unbiased estimator of $\theta$ in SRSWOR. Now suppose the sample s has been chosen with probability proportional to $f(x_1, x_2, \ldots, x_n)$, where $x_1, x_2, \ldots, x_n$ are values of the size variable x attached to the units in s. Thus,

$$p(s) = f(x_1, \ldots, x_n) \Big/ \sum_{s \in S} f(x_1, \ldots, x_n).$$

Then the estimator $t^*$ defined as $t^* = [t(s)/p(s)] \big/ \binom{N}{n}$ is an unbiased estimator of $\theta$.

A simpler method to draw a PPAS sample is the method of Horvitz and Thompson (1952) attributed to Midzuno (a method also independently suggested by Sen (1952)), where the first unit of the sample is selected with PPS and the remaining (n - 1) units from the other (N - 1) units with SRSWOR. Thus the probability of getting a particular unordered sample is the sum of the probabilities of getting it with $U_i$ (i = 1, ..., n) being selected in the first draw with probability $x_i/X$ and the remaining (n - 1) units drawn with probability $1/\binom{N-1}{n-1}$. Thus,

$$p(s) = \sum_{i \in s} \frac{x_i}{X} \Big/ \binom{N - 1}{n - 1},$$

showing thereby that p(s) is proportional to the aggregate size of s. Alternatively, p(s) can be written as $p(s) = (\bar{x}/\bar{X}) \big/ \binom{N}{n}$. An unbiased estimator of the population total Y is given by (taking $t(s) = N\bar{y}$)

$$\hat{Y}_L = \frac{N\bar{y}}{p(s)} \Big/ \binom{N}{n} = \frac{\bar{y}}{\bar{x}}\, X,$$

with variance $V(\hat{Y}_L) = E(\hat{Y}_L^2) - Y^2 = \sum_{s} \hat{Y}_L^2\, p(s) - Y^2$.

S 

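An unbiased estimator of variance (Raj, 1954) is given by

$$\text{Est } V(\hat{Y}_L) = \hat{Y}_L^2 - \frac{X}{x_s}\left[\sum_{i \in s} y_i^2 + \frac{N - 1}{n - 1} \sum\sum_{i \neq j \in s} y_iy_j\right],$$

where $x_s$ is the sample total of x, using the results

$$\text{(i)} \quad E\left[\frac{1}{p(s)} \sum_{i \in s} y_i^2\right] = \sum_{s} \Big(\sum_{i \in s} y_i^2\Big) = \binom{N - 1}{n - 1} \sum_{i=1}^{N} Y_i^2,$$

$$\text{(ii)} \quad E\left[\sum\sum_{i \neq j \in s} y_iy_j \Big/ p(s)\right] = \sum_{s} \Big(\sum\sum_{i \neq j \in s} y_iy_j\Big) = \binom{N - 2}{n - 2} \sum\sum_{i \neq j} Y_iY_j.$$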

Raj's variance estimator above cannot always take non-negative values and is non-negative provided $p(s) < 1/\binom{N}{n}$ (Koop, 1963). T.J. Rao (1972) expressed

$$V(\hat{Y}_L) = \frac{X}{\binom{N - 1}{n - 1}} \sum_{s} \frac{y_s^2}{x_s} - Y^2,$$

where $y_s$ and $x_s$ are sample totals of y and x respectively, in the form of a quadratic form $\sum_{i=1}^{N} a_{ii}Y_i^2 + 2\sum\sum_{i < j} a_{ij}Y_iY_j$, where

$$a_{ii} = \frac{X}{\binom{N - 1}{n - 1}} \sum_{s \ni i} \frac{1}{x_s} - 1, \qquad a_{ij} = \frac{X}{\binom{N - 1}{n - 1}} \sum_{s \ni i, j} \frac{1}{x_s} - 1,$$

with a view to applying an HT type of estimator of variance of the form

$$\text{Est}_{HT}\, V(\hat{Y}_L) = \sum_{i \in s} \frac{a_{ii}\, y_i^2}{\pi_i} + 2 \sum\sum_{i < j \in s} \frac{a_{ij}\, y_iy_j}{\pi_{ij}},$$

where $\pi_i$ and $\pi_{ij}$ are the expressions for the Midzuno scheme of sampling, which ultimately leads to Lahiri's p(s), i.e.

$$\pi_i = \sum_{s \ni i} p(s) = \frac{(N - n)\, x_i}{(N - 1)\, X} + \frac{n - 1}{N - 1}$$

and

$$\pi_{ij} = \sum_{s \ni i, j} p(s) = \frac{n - 1}{N - 1}\left[\frac{N - n}{N - 2} \cdot \frac{x_i + x_j}{X} + \frac{n - 2}{N - 2}\right].$$

The sufficient condition given by T.J. Rao (1972) for non-negativity of $\text{Est}_{HT}\, V(\hat{Y}_L)$ is $a_{ij} > 0$ for all i and j, which can never be satisfied (J.N.K. Rao and Vijayan, 1977).

Now considering Lahiri's estimator $\hat{Y}_L$, it may be said that $\hat{Y}_L$ is linear in the y's and $V(\hat{Y}_L) = 0$ if the $y_i/x_i$ are all equal. Hence, by the application of Rao-Vijayan's (1977) theorem the necessary form of the non-negative unbiased variance estimator is given by

$$\text{Est}_{RV}\, V(\hat{Y}_L) = -\sum\sum_{i < j} a_{ij}(s)\, x_ix_j\left(\frac{y_i}{x_i} - \frac{y_j}{x_j}\right)^2, \quad s \in S,$$

where $a_{ij}(s) = 0$ if s does not contain both i and j, and

$$\sum_{s \ni i, j} a_{ij}(s)\, p(s) = a_{ij}, \quad (i < j),$$

where $a_{ij}$ is the expression given earlier.

For n = 2, the only possible non-negative unbiased variance estimator is of the form

$$-a_{ij}(s)\, x_ix_j\left(\frac{y_i}{x_i} - \frac{y_j}{x_j}\right)^2, \quad s = \{i, j\},$$

where

$$a_{ij}(s) = a_{ij}/p(s) = \frac{X}{x_i + x_j}\left[\frac{X}{x_i + x_j} - (N - 1)\right].$$

... PPSWOR and the rest (n - r) units with equal probability without replacement to provide an unbiased estimate of the population total.

b) Deshpande's (1982) Method: As a counterpart to the Midzuno-Sen sampling procedure, Deshpande (1982) has suggested the following sampling scheme:

Draw the first unit with probability $p_i$ and reject this unit.

Draw a sample s of size n out of the remaining (N - 1) units in the population with equal probability and without replacement. Thus

$$p(s) = \frac{1}{\binom{N - 1}{n}} \sum_{i \notin s} p_i = \frac{1}{\binom{N - 1}{n}}\left(1 - \sum_{i \in s} p_i\right),$$

$$\pi_i = \frac{n(1 - p_i)}{N - 1}, \qquad \pi_{ij} = (1 - p_i - p_j) \frac{n(n - 1)}{(N - 1)(N - 2)} = \frac{n - 1}{N - 2}\left[\pi_i + \pi_j - \frac{n}{N - 1}\right].$$

For a πPS design, equate $\pi_i = n(1 - p_i)/(N - 1) = nZ_i$, where $Z_i = X_i/X$, $X_i$ being the size of the ith unit and X the population total of x. Thus it follows that $p_i = 1 - (N - 1)Z_i$, which will be achieved if max $X_i < X/(N - 1)$. Further,

$$\pi_{ij} = \frac{n(n - 1)}{N - 2}\left[Z_i + Z_j - \frac{1}{N - 1}\right].$$

Remarks: i) $\pi_i \le n/(N - 1)$ for every i.

ii) $\pi_{ij} \le n(n - 1)/(N - 1)(N - 2)$ for every i and j.

iii) As $\pi_i\pi_j - \pi_{ij}$ is non-negative for each pair (i, j), the SYG variance estimator of the HT estimator of population total is always non-negative.

An estimator of population total under Deshpande's scheme with

$$p(s) = \frac{1}{\binom{N - 1}{n}}\left(1 - \sum_{i \in s} p_i\right)$$

is given by

$$\hat{Y}_D = \frac{N\bar{y}}{p(s)} \cdot \frac{1}{\binom{N}{n}}. \quad \text{If } p_i \propto X_i, \quad \hat{Y}_D = \left(\frac{\bar{y}}{\bar{X}_{N-n}}\right) X,$$

where $\bar{X}_{N-n} = (N\bar{X} - n\bar{x})/(N - n)$, and

$$\text{Est } V(\hat{Y}_D) = \hat{Y}_D^2 - \frac{N}{n}\left(\frac{\bar{X}}{\bar{X}_{N-n}}\right)\left[\sum_{i \in s} y_i^2 + \frac{N - 1}{n - 1} \sum\sum_{i \neq j \in s} y_iy_j\right].$$

c) Method of Das and Mohanty (1973): Das and Mohanty (1973) have suggested a technique of PPS sampling without replacement to obtain a πPS sample. The method consists of obtaining a sample space consisting of b samples, each of n distinct units, in such a way that over all samples the ith unit occurs $R_i$ times, where $R_i$ is an exact multiple of $X_i$ (i = 1, 2, ..., N), and given any pair of units, there exists at least one sample in which they occur together. The sample space of b samples is formed as a combination of two sub-sample spaces of $b_1$ and $b_2$ samples, where $b_1 + b_2 = b$. The first sub-sample space is so formed that in it every unit occurs a constant number of times, say r, and given any two units there is at least one sample in which they occur together. The second sub-sample space consists of $b_2$ samples in which the ith unit occurs ($CX_i - r$) times, where C is a multiplier such that $CX_i \ge r$ and $C\sum X_i$ is divisible by n, with $C\sum X_i/n$ equal to b. Finally, one sample out of these b samples is selected with equal probability attached to each sample.

Gupta, Nigam and Kumar (1982) and Nigam, Kumar and Gupta (1984) have given different methods to construct the sample space by skillful handling of the methods of construction of binary incomplete block designs and balanced incomplete block designs, thereby realising first order inclusion probabilities proportional to measure of size (IPPS). Arnab and Roy (1990) have presented sampling schemes using symmetrical balanced incomplete block designs, realising pre-assigned sets of inclusion probabilities of the first two orders.

Miscellaneous Methods 


a) Poisson Sampling: This is a method of drawing units with prescribed probabilities. For each unit of the population perform a Bernoulli trial with probability of success $\pi_i = nX_i/X$, where $X_i$ is the size of the unit and $0 < \pi_i \le 1$. Thus Poisson sampling is defined by

$$p(s) = \prod_{i \in s} \pi_i \prod_{i \in S - s} (1 - \pi_i)$$

and is due to Hajek (1964).

Because of the independence of trials, we have $\pi_{ij} = \pi_i\pi_j$ ($i \neq j$), and the HT variance estimator takes the form

$$\text{Est } V(\hat{Y}_{HT}) = \sum_{i=1}^{\nu} \frac{1 - \pi_i}{\pi_i^2}\, y_i^2.$$

The sample size $\nu$ is a random variable and $E(\nu)$ = sum of the expected values of success in each of the N trials = $\sum_{i=1}^{N} \pi_i = n$, and

$$V(\nu) = \sum_{i=1}^{N} \pi_i(1 - \pi_i) = \sum \pi_i - \sum \pi_i^2 = n - n^2 \sum (X_i/X)^2,$$

with a maximum value of n(1 - n/N) when the sizes are equal. In Poisson sampling there is the possibility of an empty sample with probability

$$P_0 = \prod_{i=1}^{N} (1 - \pi_i) = \prod_{i=1}^{N} \left(1 - \frac{nX_i}{X}\right),$$

which reaches its maximum in the case of equal sizes and tends to $e^{-n}$ as $N \to \infty$. In case of an empty sample, the procedure is repeated until a non-empty sample is achieved. The possibility of an empty sample can also be ruled out by adopting the technique of Modified Poisson Sampling introduced by Ogus and Clark (1971).
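The moments of the random sample size are simple to compute for a given set of sizes. The sketch below is illustrative only: the names `poisson_design` and `poisson_sample` are assumptions. It returns the inclusion probabilities, the expected sample size, its variance, and the probability of an empty sample, and draws one sample by independent Bernoulli trials.

```python
import random
from math import prod


def poisson_design(sizes, n):
    """Inclusion probabilities and sample-size moments for Poisson sampling."""
    X = sum(sizes)
    pi = [n * x / X for x in sizes]
    if max(pi) > 1:
        raise ValueError("need n * X_i / X <= 1 for every unit")
    expected_size = sum(pi)                       # E(nu) = n
    var_size = sum(q * (1 - q) for q in pi)       # V(nu) = sum pi_i (1 - pi_i)
    p_empty = prod(1 - q for q in pi)             # probability of no units
    return pi, expected_size, var_size, p_empty


def poisson_sample(pi, rng=random):
    """One Poisson sample: an independent Bernoulli trial for each unit."""
    return [i for i, q in enumerate(pi) if rng.random() < q]


pi, e_size, v_size, p0 = poisson_design([1] * 10, n=2)
print(e_size, v_size, p0)
print(poisson_sample(pi, random.Random(3)))
```

With equal sizes the variance attains the stated maximum n(1 - n/N), and the empty-sample probability is $(1 - n/N)^N$.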

b) Inverse Sampling: In a sequence of trials, let the probability of success in each trial be p. Continue the trials until x successes are observed. Then, the number of trials n needed to obtain x successes is a random variable having probability distribution

$$p(n) = \binom{n - 1}{x - 1} p^x (1 - p)^{n - x}, \quad n = x, x + 1, \ldots$$

This sampling has the peculiarity that the number of successes is prefixed rather than the sample size and is therefore called inverse sampling, where an unbiased estimator of p is given by (x - 1)/(n - 1), that is, the proportion of successes when the last trial (the nth in this case), which yields the xth success, is ignored. Sampford (1962) exploited such an idea and put forward a method of inverse sampling with unequal probabilities. The method is as follows: A sample is drawn unit by unit by successive sampling with replacement and with PPS. Sampling is discontinued when (n + 1) distinct units are obtained. Let (r + 1) be the number of draws required for the purpose. An unbiased estimator of population total Y suggested by Sampford is

$$\hat{Y}_S = \frac{1}{r} \sum_{i=1}^{r} \frac{y_i}{p_i}, \quad p_i = X_i/X,$$

neglecting the outcome of the last draw and thus basing the estimate on a sample of size r with n distinct units. The formula for $V(\hat{Y}_S)$ is very complicated; however, an unbiased estimator of variance is

$$\text{Est. } V(\hat{Y}_S) = \frac{1}{r(r - 1)} \sum_{i=1}^{r} \left(\frac{y_i}{p_i} - \hat{Y}_S\right)^2.$$

The interested readers may refer to Sampford (1962) and Pathak (1964) to get more insight into inverse sampling.
4.3.6 Efficiency of Without Replacement Estimators

The efficiency of Horvitz-Thompson estimator varies from one sampling 
scheme to another. However, this variation is so little in practice, especially in 
large scale sample surveys, that one should embark upon choosing a sampling 
scheme based on other criteria such as simplicity in selection procedure, 
simplicity in variance estimation, general applicability, etc. 

Due to mathematical difficulties in comparing HT estimators in various sampling schemes involving complicated expressions for $\pi_{ij}$, Bayless and Rao (1970) have calculated the expected variance of $\hat{Y}_{HT}$ under a super population model to get a simplified expression for the variance formula which fortunately gets rid of the $\pi_{ij}$'s.

Consider a super population model ($\eta_p$)

$$Y_i = \beta p_i + e_i, \quad i = 1, 2, \ldots, N,$$

where $E_m(e_i \mid p_i) = 0$, $E_m(e_ie_j \mid p_i, p_j) = 0$ ($i \neq j$) and

$$E_m(e_i^2 \mid p_i) = \sigma^2 p_i^g, \quad \sigma^2 > 0 \text{ and } g \ge 0.$$

Assuming $\pi_i = np_i$, then

$$E_m\left[V(\hat{Y}_{HT})_{SYG}\right] = \frac{\sigma^2}{n} \sum_{i=1}^{N} p_i^{g-1}(1 - np_i),$$

which is independent of the $\pi_{ij}$'s. Thus, under the above super population model the HT estimator has the same expected variance for all πPS designs. Rao and Bayless (1969) and Bayless and Rao (1970) have conducted extensive empirical and semi-empirical studies for n = 2, n = 3 and n = 4 to conclude that the HT estimators do not show appreciable differences in efficiency. They have also found through empirical studies that Murthy's estimator happens to be nearly always more efficient than the HT estimator except in certain artificial populations. Further, the RHC estimator was consistently less efficient than Murthy's estimator. As pointed out by Hanif and Brewer (1980), the close agreement between the empirical and semi-empirical results of Rao and Bayless tends to suggest that the form of the linear stochastic model assumed by them is reasonably appropriate. More discussion of the empirical comparison of efficiency is found in Hanif and Brewer (1980) and Brewer and Hanif (1982).

Considering a super population model ($\eta_x$), $Y_i = \beta x_i + e_i$, $E_m(e_i \mid x_i) = 0$, $E_m(e_i^2 \mid x_i) = \sigma^2 x_i^g$ with $0 \le g \le 2$, $E_m(e_ie_j \mid x_i, x_j) = 0$ ($i \neq j$), where $x_i$ is the measure of size, Rao (1966b) has compared the average variances (AV) of the following strategies: Hansen-Hurwitz strategy ($H_1$) with πPS Horvitz-Thompson estimator; Rao-Hartley-Cochran strategy ($H_2$), Murthy's strategy ($H_3$), Midzuno-Lahiri-Sen strategy ($H_4$) and all the strategies with the same n and $\pi_i$ ($H_5$). He showed that for g = 1, $H_4$ is uniformly best among all the strategies and $H_2$, $H_3$, $H_5$ are equally efficient. For further studies on the comparison of strategies for different values of g, the readers may refer to J.N.K. Rao (1966b), T.J. Rao (1967), Hanurav (1967), Chaudhuri and Arnab (1979a), Padmawar (1981), Bansal and Singh (1985) and Amahia et al. (1989).

Foreman and Brewer (1971) have utilized the intercept model ($\xi$)

$$Y_i = \alpha + \beta x_i + e_i, \quad E_m(e_i \mid x_i) = 0,$$

$$E_m(e_i^2 \mid x_i) = \sigma^2 x_i^{2\gamma}, \quad E_m(e_i e_j \mid x_i x_j) = 0,\ i \ne j, \quad 1/2 < \gamma < 1,$$

(Smith, 1938, Jessen, 1942 and Mahalanobis, 1944)

to compare sampling with and without replacement for selection with
probability proportional to size.

If $y$ and $x$ are poorly correlated, Rao (1966b) suggested a simpler
model

$$Y_i = \mu + e_i, \quad E_m(e_i) = 0, \quad V_m(e_i) = \sigma^2, \quad \mathrm{cov}_m(e_i, e_j) = 0 \ (i \ne j)$$

to compare the performance of the above strategies.

The criticism against comparing different strategies with respect to their
average variances is that the choice may be disastrous if the model fails. This
may be taken care of by making the super population model as general as
possible and by robustness considerations. The study of robustness
investigates (i) how much efficiency is lost by the proposed strategy under
departures from the assumed model and (ii) how more efficient alternative
robust strategies can be derived.

Godambe-Joshi Lower Bound 

For any strategy $(p, t)$ for estimating $Y$ under a super population model ($\xi$) we
have the known result

$$E_m[M_p(p, t)] = E_p V_m(t) + E_p[B_m(t)]^2 + V_m(Y) - 2E_m[(Y - \mu)E_p(t - \mu)],$$

where $M_p(p, t) = E_p(t - Y)^2$ is the design mean square error,
$B_m(t) = E_m(t - Y)$ and $\mu = E_m(Y)$. If $(p, t)$ is unbiased for $Y$, then
$E_p(t - \mu) = Y - \mu$ and

$$E_m[V_p(p, t)] = E_p V_m(t) + E_p[B_m(t)]^2 - V_m(Y).$$

Therefore it follows that

$$E_m[V_p(p, t)] \ge E_p V_m(t) - V_m(Y).$$

Godambe and Joshi (1965) have shown that there exists a lower bound to
the average variance of a $p$-unbiased strategy under a general super population
model $\xi$.

Theorem 4.6: Assume a super population model $\xi$: $Y_1, \ldots, Y_N$ are independent
random variables with $E_m(Y_i) = \mu_i$ and $V_m(Y_i) = \sigma_i^2$ $(i = 1, \ldots, N)$. For any unbiased
sampling strategy $(p, t)$ with first order inclusion probabilities $\pi_i$,

$$E_m V_p(p, t) \ge \sum_{i=1}^{N} \sigma_i^2 (1/\pi_i - 1).$$

Proof. Let $t(s, Y) = t_{HT} + h(s, Y)$, where $h(s, Y) = t(s, Y) - t_{HT}$ depends
on $S \times R^N$ only through those $Y_i$ for which $i \in s$.

From the unbiasedness of $t(s, Y)$ it follows that $\sum_{s} h(s, Y)p(s) = 0$ for all $Y$.

$$V_m[t(s, Y)] = V_m(t_{HT}) + V_m[h(s, Y)] + 2C_m[t_{HT}, h(s, Y)]$$

Hence,

$$E_p V_m(t) = E_p V_m(t_{HT}) + E_p V_m[h(s, Y)] + 2E_p C_m[t_{HT}, h(s, Y)].$$

We have,

$$E_p C_m[t_{HT}, h(s, Y)] = E_p E_m\left[\sum_{i \in s} \frac{Y_i - E_m(Y_i)}{\pi_i}\, h(s, Y)\right]$$

$$= \sum_{s \in S} E_m\left[\sum_{i \in s} \frac{Y_i - E_m(Y_i)}{\pi_i}\, h(s, Y)\right] p(s)$$

$$= \sum_{i=1}^{N} E_m\left[\frac{Y_i - E_m(Y_i)}{\pi_i} \sum_{s \ni i} h(s, Y)p(s)\right]$$

$$= -\sum_{i=1}^{N} E_m\left[\frac{Y_i - E_m(Y_i)}{\pi_i} \sum_{s \not\ni i} h(s, Y)p(s)\right]$$

$$= 0,$$

since $(Y_i - E_m(Y_i))/\pi_i$ and $\sum_{s \not\ni i} h(s, Y)p(s)$ are independently distributed. Thus,

$$E_p V_m(t) \ge E_p V_m(t_{HT}) = \sum_{i=1}^{N} \sigma_i^2/\pi_i.$$

Hence,

$$E_m V_p(t) \ge E_p V_m(t) - V_m(Y) \ge \sum_{i=1}^{N} \sigma_i^2 (1/\pi_i - 1).$$

Remarks 

1. The lower bound is attained by $t_{HT}$ for a fixed sample size ($n$)
design with $\pi_i \propto E_m(Y_i)$, $i = 1, \ldots, N$. If $E_m(Y_i) = \mu_i = \beta x_i$, $\beta$ being a constant,
any fixed sample size πPS design, when applied to the HT estimator $t_{HT}$,
attains the lower bound.

2. The lower bound is the minimum attainable value of the average variance
under the super population model $\xi$ assumed in the theorem, in the class of all
unbiased strategies with $p \in \rho_n$ and a fixed set of $\pi_i$ values. This bound is
further minimized by a $p \in \rho_n$ with $\pi_i \propto \sigma_i$, thus making the minimum
attainable value of $E_m V_p(t)$ equal to $(\sum \sigma_i)^2/n - \sum \sigma_i^2$.
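The bound can be checked numerically on a small case. The following is a minimal sketch, not taken from the text: the population size, the model means `mu` and standard deviations `sigma`, and the function names are all illustrative assumptions. It uses the fact that, for independent $Y_i$, the average variance of the HT strategy equals its design variance evaluated at $Y = \mu$ plus $\sum \sigma_i^2(1/\pi_i - 1)$, so the bound is attained exactly when $\pi_i \propto \mu_i$.

```python
from itertools import combinations

def srswor_design(N, n):
    """All samples of size n, each with equal probability (SRSWOR)."""
    samples = list(combinations(range(N), n))
    return {s: 1.0 / len(samples) for s in samples}

def inclusion_probs(design, N):
    """First order inclusion probabilities pi_i of a design."""
    pi = [0.0] * N
    for s, p in design.items():
        for i in s:
            pi[i] += p
    return pi

def ht_design_variance(design, pi, y):
    """Design variance of the HT estimator for a fixed vector y."""
    total = sum(y)
    return sum(p * (sum(y[i] / pi[i] for i in s) - total) ** 2
               for s, p in design.items())

def godambe_joshi_bound(pi, sigma):
    """Lower bound: sum of sigma_i^2 (1/pi_i - 1)."""
    return sum(sd ** 2 * (1.0 / p - 1.0) for sd, p in zip(sigma, pi))

def average_variance_ht(design, pi, mu, sigma):
    """E_m V_p(t_HT) for independent Y_i with means mu and s.d. sigma."""
    return ht_design_variance(design, pi, mu) + godambe_joshi_bound(pi, sigma)

# Illustrative (assumed) population: N = 4, n = 2, so pi_i = 1/2 for all i.
design = srswor_design(4, 2)
pi = inclusion_probs(design, 4)
sigma = [1.0, 1.0, 2.0, 2.0]
bound = godambe_joshi_bound(pi, sigma)

av_proportional = average_variance_ht(design, pi, [2.0, 2.0, 2.0, 2.0], sigma)
av_other = average_variance_ht(design, pi, [1.0, 2.0, 3.0, 4.0], sigma)
```

With constant model means the inclusion probabilities are proportional to $E_m(Y_i)$ and the average variance coincides with the bound; with unequal means under the same design it lies strictly above it.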

In the above context it is of interest to search for optimal strategies in the
sense of minimum average variance in the class of all linear unbiased
estimators with respect to a sampling design $p \in \rho_n$.

Let $L_u$ denote the class of unbiased estimators and $H_n$ the class of
strategies $(p, t)$, $p \in \rho_n$, $t \in L_u$.

Consider model E m (L,) = a, + px,, K»(Y/) = a 2 /, Cov n i (Yi, Yj) = pv 2 fifj(i 
*j= 1,... AO, where a/, p are known constants;/ a known function ofx/; p, c 2 
are unknown and -\/(N- 1) < p < 1. Then, for any p e p n , t e L u 

E m E p {t - Y) 2 > (1- p) [ 1 In (iV/) 2 - = Em E p o {t° K d~ Y) 2 , 

where p e p /; is such that n, (p 0 ) = n'lfl'L'lf = n, (h 

t° K u = I (Y, -a j -fa, )/n io +Z(a t +px ; )=t 0 , say. 

Tam (1984) has shown that $(t_1, p_1)$ is optimal iff $(t_1, p_1) = (t_0, p_0)$.

Special Cases: Godambe (1955) considered model witha< — 0, p = 0 and a/ - 0, 
f = x,. Cassel et al. (1976) considered the case with/ = xf. Mukherjee and 
Sengupta ( 1989) approached the problem with a more general model E m (Y,) 
= p, and Cov /H (Y h Yj) = v,y, i*j = 1, ..., N. 

4.3.7 Variance Estimators in WOR Sampling and Their Stability 

The usual variance estimator in question is the Sen-Yates-Grundy unbiased
variance estimator of the HT estimator of the population total, given by

$$\mathrm{Est}_{SYG}\, V(\hat{Y}_{HT}) = \sum_{i<j \in s} \frac{\pi_i \pi_j - \pi_{ij}}{\pi_{ij}} \left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2,$$

where $\pi_i$ and $\pi_{ij}$ are the inclusion probabilities for the unit $i$ and a pair of units $(i, j)$
in a sample of fixed size $n$, with $\pi_i > 0$ for all $i$ and $\pi_{ij} > 0$ for all $(i, j)$. The
calculation of variance estimators and the accompanying treatment become
cumbersome because of complicated expressions for $\pi_{ij}$ for large $n$. To
overcome these difficulties Hartley and Rao (1962) derived an asymptotic form
of $\pi_{ij}$ for the Goodman-Kish systematic selection procedure. Brewer and Hanif
(1982) and Herzel (1986) have suggested some approximate expressions for $\pi_{ij}$.
Asok and Sukhatme (1976) have evaluated approximate expressions of $\pi_{ij}$ for
Sampford's selection procedure. Jessen (1969) has derived an approximate
expression for the Yates-Grundy form of $V(\hat{Y}_{HT})$ as

$$V(\hat{Y}_{HT})_J = \frac{n - \sum_{i=1}^{N} \pi_i^2}{2N(N-1)} \sum\sum_{i \ne j} \left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2$$

by replacing $W_{ij} = \pi_i \pi_j - \pi_{ij}$ by

$$\bar{W} = \frac{1}{N(N-1)/2} \sum\sum_{i<j} W_{ij},$$

with the corresponding estimator

$$\mathrm{Est}_J\, V(\hat{Y}_{HT}) = \frac{n - \sum_{i=1}^{N} \pi_i^2}{2n(n-1)} \sum\sum_{i \ne j \in s} \left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2.$$

The approximate variance estimators suggested above, although simple in
appearance and free of the complicated $\pi_{ij}$'s, are no doubt biased. It may be mentioned here
that any sampling procedure which allows $\pi_{ij}$ to take the value zero results in a biased
Sen-Yates-Grundy variance estimator.
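To make the two formulas concrete, here is a minimal sketch that evaluates the Sen-Yates-Grundy estimator and the Jessen-type approximation on one sample. The function names and the illustrative data (an SRSWOR sample of $n = 3$ from $N = 10$) are assumptions made for this example only; under SRSWOR both expressions should collapse to the familiar $N^2(1-f)s^2/n$.

```python
def syg_estimate(y, pi, pij):
    """Sen-Yates-Grundy estimate from sample values y, first order
    probabilities pi and joint probabilities pij[(i, j)], i < j,
    all indexed by position in the sample."""
    n = len(y)
    est = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = (pi[i] * pi[j] - pij[(i, j)]) / pij[(i, j)]
            est += w * (y[i] / pi[i] - y[j] / pi[j]) ** 2
    return est

def jessen_estimate(y, pi, sum_pi_sq):
    """Jessen-type approximation; besides the sample it needs only the
    population sum of pi_i^2, not the joint probabilities."""
    n = len(y)
    ss = sum((y[i] / pi[i] - y[j] / pi[j]) ** 2
             for i in range(n) for j in range(n) if i != j)
    return (n - sum_pi_sq) / (2 * n * (n - 1)) * ss

# Illustrative SRSWOR case: N = 10, n = 3, pi_i = n/N, pi_ij = n(n-1)/(N(N-1)).
N, n = 10, 3
y = [4.0, 7.0, 10.0]
pi = [n / N] * n
joint = n * (n - 1) / (N * (N - 1))
pij = {(i, j): joint for i in range(n) for j in range(i + 1, n)}

syg = syg_estimate(y, pi, pij)
jessen = jessen_estimate(y, pi, N * (n / N) ** 2)

ybar = sum(y) / n
s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
srs_formula = N ** 2 * (1 - n / N) * s2 / n
```

For unequal $\pi_i$ the two estimates generally differ, which is the bias referred to above.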

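Hanif and Brewer (1980), Kumar, Gupta and Agarwal (1985) and Hanif,
Mukhopadhyay and Bhattacharya (1993) have derived certain model based
variance estimators from which Jessen's estimator comes out as a special case.
The first of these is

$$\mathrm{Est}_{HB}\, V(\hat{Y}_{HT}) = \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{N} \pi_i^{g}}{\sum_{i=1}^{N} \pi_i^{g-1}}\right) \sum_{i \in s}\left(\frac{y_i}{\pi_i} - \frac{\hat{Y}_{HT}}{n}\right)^2,$$

which for $g = 2$ reduces to Jessen's variance estimator. Use the result

$$\sum_{i \in s}\left(\frac{y_i}{\pi_i} - \frac{\hat{Y}_{HT}}{n}\right)^2 = \frac{1}{2n} \sum\sum_{i \ne j \in s}\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2.$$

Kumar, Gupta and Agarwal (1985) suggested another model based variance
estimator

$$\mathrm{Est}_{KGA}\, V(\hat{Y}_{HT}) = \frac{\sum_{i=1}^{N} p_i^{g-1}(1 - np_i)}{2(n-1)\sum_{i=1}^{N} p_i^{g-1}} \sum\sum_{i \ne j \in s}\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2,$$

which for $\pi_i = np_i$ reduces to the variance estimator $\mathrm{Est}_{HB}\, V(\hat{Y}_{HT})$ of Brewer and
Hanif (1980). It is interesting to note that for SRSWOR the variance
estimator of KGA turns out to be the usual variance estimator in SRSWOR.

Under the super population model ($\xi$) it may be seen that

$$E_p[E_m(\mathrm{Est}_{KGA}\, V(\hat{Y}_{HT}))] = E_p[E_m(\mathrm{Est}_{HB}\, V(\hat{Y}_{HT}))] = E_p[E_m(\mathrm{Est}_{SYG}\, V(\hat{Y}_{HT}))].$$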

Hanif, Mukhopadhyay and Bhattacharya (1993) have suggested another
variance estimator of the HT estimator by considering $(y_i/\pi_i - y_j/\pi_j)^2$ for $i \ne j$ as
building blocks for constructing a model based estimator of $V(\hat{Y}_{HT})_{SYG}$ in the
form

EstHMB V(YHi) = 


i N 
n 1 != I 


f (l-«/?/)!I 


Zl 


\ 


71 ,/7 


apf’ 2 + pr 2 ) 


For SRSWOR,

$$\mathrm{Est}_{HMB}\, V(\hat{Y}_{HT}) = N^2\left(\frac{1}{n} - \frac{1}{N}\right)\frac{1}{n-1}\sum_{i \in s}(y_i - \bar{y})^2 = \mathrm{Est}\, V(\hat{Y})_{srswor}.$$

The HMB estimator is model unbiased and hence design-model unbiased.
However, the estimator is design biased and depends on the value of $g$, like the
KGA and HB estimators. Empirical studies show that the amount of bias
involved in the HMB estimator is fairly large for the lower range of values of $g$.
However, HMB is empirically seen to be more stable than its counterparts
considered above.

Rao and Bayless (1969) and Bayless and Rao (1970) made semi-empirical
studies using a super population model to compare the stability of the SYG variance
estimator under different IPPS schemes.

As the computation of the estimate of variance of the HT estimator in
without replacement sampling involves the potentially troublesome
calculation of the joint inclusion probabilities, one may be interested to know
the consequence of estimating the variance as if the sample were selected with
replacement. No doubt this would result in a biased estimator, but the bias may
not be very significant in large scale sample surveys, and moreover the
variance estimator in WR sampling has a computational advantage. In this
connection we have the following important result due to Durbin (1953).
Suppose we use the estimator $\mathrm{Est}\, V(\hat{Y}_{ppswr})$ under a πPS design. Its bias
is then given by

$$\mathrm{Bias}[\mathrm{Est}\, V(\hat{Y}_{ppswr})] = \frac{n}{n-1}\left[V(\hat{Y}_{ppswr}) - V(\hat{Y}_{HT})_{\pi PS}\right].$$

This result implies that when we use the PPSWR estimator for the πPS
sampling design, the estimator $\mathrm{Est}\, V(\hat{Y}_{ppswr})$ tends to overestimate
the variance of $\hat{Y}_{HT}$ under the πPS design whenever $V(\hat{Y}_{ppswr})$ is greater than
$V(\hat{Y}_{HT})_{\pi PS}$, and tends to underestimate the variance whenever
$V(\hat{Y}_{ppswr})$ is less than $V(\hat{Y}_{HT})_{\pi PS}$. Thus, whenever πPS sampling is the more
efficient of the two, $\mathrm{Est}\, V(\hat{Y}_{ppswr})$ acts as a conservative
estimator of $V(\hat{Y}_{HT})_{\pi PS}$ for its application in πPS sampling.
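Durbin's bias expression can be verified by complete enumeration in the simplest πPS design, namely SRSWOR with $p_i = 1/N$ and $\pi_i = np_i = n/N$. The sketch below is only an illustration under that assumption; the four $y$ values and all variable names are chosen for the example and are not part of the text.

```python
from itertools import combinations

# Illustrative population; SRSWOR is treated as the special piPS design
# with draw probabilities p_i = 1/N, so that pi_i = n * p_i.
y = [3.0, 5.0, 6.0, 10.0]
N, n = len(y), 2
p = [1.0 / N] * N
Y = sum(y)

def wr_variance_estimate(sample):
    """PPSWR variance estimator applied to a without-replacement sample."""
    z = [y[i] / p[i] for i in sample]
    zbar = sum(z) / n
    return sum((zi - zbar) ** 2 for zi in z) / (n * (n - 1))

samples = list(combinations(range(N), n))
prob = 1.0 / len(samples)

# Design expectation of the WR variance estimator under the WOR design.
expected_est = sum(prob * wr_variance_estimate(s) for s in samples)

# True variances of the PPSWR estimator and of the HT estimator.
v_wr = sum(p[i] * (y[i] / p[i] - Y) ** 2 for i in range(N)) / n
v_wor = sum(prob * (sum(y[i] / (n * p[i]) for i in s) - Y) ** 2
            for s in samples)

bias = expected_est - v_wor
durbin = n / (n - 1) * (v_wr - v_wor)
```

Here the without replacement design is the more efficient one, so the bias is positive and the WR variance estimator errs on the conservative side.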

4.3.8 Optimal Properties of Horvitz-Thompson Estimator 
Homogeneous Linear Estimator (HLE) 

The most usual form of a homogeneous linear estimator may be written as

$$t = t(s, Y) = \sum_{i \in s} t_{si} Y_i,$$

where (i) $t(s, Y)$ is a real valued function defined on $S \times R^N$, (ii) for each $s \in S$,
$t(s, Y)$ depends only on those values of $Y_i$ for which $i \in s$, and (iii) $t_{si}$ is the coefficient
associated with $Y_i$, depending both on the sample $s$ and the unit $i \in s$.


Linear Parametric Function and Estimability 

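Let $\theta(Y)$ be a linear parametric function of $(Y_1, Y_2, \ldots, Y_N)$ of the form

$$\theta(Y) = \sum_{i=1}^{N} a_i Y_i,$$

where $a_1, a_2, \ldots, a_N$ are real numbers, not all zero.

$\theta(Y)$ is said to be estimable with respect to a sample design $D = (S, p(s))$ if
there exists a function of the $Y_i$'s $(i \in s)$, say $t(s, Y)$, such that

$$\theta(Y) = \sum_{s \in S} p(s)\, t(s, Y),$$

uniformly in $Y \in R^N$. $t(s, Y)$ is called an unbiased estimator of $\theta(Y)$. Now,

$$\theta(Y) = \sum_{i=1}^{N} a_i Y_i = \sum_{i=1}^{N} \frac{a_i Y_i}{\pi_i}\, \pi_i = \sum_{i=1}^{N} \frac{a_i Y_i}{\pi_i} \sum_{s \ni i} p(s) = \sum_{s \in S} p(s) \sum_{i \in s} \frac{a_i Y_i}{\pi_i}, \quad \pi_i > 0.$$

Since $\theta(Y) = \sum_{s} p(s)\, t(s, Y)$, this is satisfied by $t(s, Y) = \sum_{i \in s} a_i Y_i/\pi_i$. Put
$a_1 = a_2 = \cdots = a_N = 1$; then $\theta(Y) = \sum_{i=1}^{N} Y_i = Y$, and hence the following theorem.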


Theorem 4.7: For every $p$, there exists a homogeneous linear unbiased
estimator of $Y$ of the form $t(s, Y) = \sum_{i \in s} Y_i/\pi_i = t_{HT}$, the Horvitz-Thompson
estimator, provided $\pi_i > 0$ for $1 \le i \le N$.


Remark 1. A sampling design $p$ admits a linear unbiased estimator of $Y$ if and
only if it has $\pi_i > 0$ for all $i$.
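Theorem 4.7 and Remark 1 can be illustrated by complete enumeration. The following minimal sketch uses a small hypothetical design with samples of unequal size and probability (the design, the $y$ values and the function names are assumptions for this example) and exact rational arithmetic, so the unbiasedness of $t_{HT}$ can be checked without rounding.

```python
from fractions import Fraction as F

# Hypothetical design on U = {0, 1, 2, 3}; every unit appears in some sample.
design = {(0, 1): F(1, 4), (1, 2, 3): F(1, 2), (0, 3): F(1, 4)}
N = 4

# First order inclusion probabilities: pi_i = sum of p(s) over samples containing i.
pi = [sum(p for s, p in design.items() if i in s) for i in range(N)]

def ht(sample, y):
    """Horvitz-Thompson estimate of the total from one sample."""
    return sum(F(y[i]) / pi[i] for i in sample)

def expectation(estimator, y):
    """Design expectation of an estimator by enumerating all samples."""
    return sum(p * estimator(s, y) for s, p in design.items())

y = [3, 7, 9, 25]
ht_mean = expectation(ht, y)

# The coefficient condition: for every unit i, the sum over samples
# containing i of p(s) * (1/pi_i) equals 1.
coefficient_sums = [sum(p / pi[i] for s, p in design.items() if i in s)
                    for i in range(N)]
```

If some unit had $\pi_i = 0$ the division would fail, which mirrors the condition in Remark 1.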

Remark 2. If $p$ is a unicluster design with $\pi_i > 0$ for all $i$, then $t_{HT}$ is the only linear
unbiased estimator of $Y$ (Lanke, 1973). (In a unicluster design two samples $s_1$
and $s_2$ are either disjoint or equivalent.)

The unbiasedness of $t_{HT}$ requires the condition that $\pi_i > 0$ for all $i$. To
prove uniqueness, suppose $t(s, Y) = \sum_{i \in s} t_{si} Y_i$ is unbiased for $Y$. Since each unit $i$
in $U$ belongs to exactly one of the possible samples $s_1, s_2, \ldots, s_M$ (say), we have

$$\sum_{k=1}^{M} p(s_k) \sum_{i \in s_k} t_{s_k i}\, Y_i = \sum_{i=1}^{N} Y_i.$$

Thus, $t_{s_k i} = 1/p(s_k)$ and the uniqueness follows.

Theorem 4.8: The necessary and sufficient condition for an estimator $t(s, Y) =
\sum_{i \in s} t_{si} Y_i$, $s \in S$, to be unbiased for the population total $Y$ under a given design $p$ is

$$\sum_{s \ni i} t_{si}\, p(s) = 1, \quad 1 \le i \le N.$$

Proof. Now

$$E[t(s, Y)] = \sum_{s} p(s) \sum_{i \in s} t_{si} Y_i = \sum_{i=1}^{N} Y_i \sum_{s \ni i} t_{si}\, p(s) = \sum_{i=1}^{N} Y_i$$

holds uniformly in $Y$ if and only if

$$\sum_{s \ni i} t_{si}\, p(s) = 1, \quad 1 \le i \le N.$$

Note: For a non-homogeneous linear estimator (NHLE) of the form

$$t(s, Y) = t_s + \sum_{i \in s} t_{si} Y_i,$$

the condition for unbiasedness includes the further condition $\sum_{s} t_s\, p(s) = 0$.

Remark: There might be many choices of the estimator $t(s, Y)$. However, in
practice one forms an estimator which looks like a sample copy of the
population parametric function. Murthy (1963) and Srivastava (1985) have
provided a general approach to constructing a class of unbiased estimators for
finite population parametric functions under any arbitrary sampling design.
This has led to a general solution of $\theta(Y) = \sum_{s} p(s)\, t(s, Y)$, giving a
generalized class of Horvitz-Thompson estimators.

The choice of an ideal estimator from among the estimators under a given
sampling design $D$ boils down to choosing one having minimum mean square
error uniformly in $Y \in R^N$. When the condition of unbiasedness is imposed on
the estimator, our attention is centred upon finding a uniformly minimum-variance
unbiased estimator (UMVUE).

An estimator $t(s, Y)$ is said to be a UMVU estimator under a given design
$D$ for estimating a parametric function $\theta(Y)$ if and only if, given any other
estimator $t^*(s, Y)$,

i) $E_p[t(s, Y)] = E_p[t^*(s, Y)] = \theta(Y)$, for every $Y \in R^N$;

ii) $\mathrm{MSE}_p[t(s, Y)] \le \mathrm{MSE}_p[t^*(s, Y)]$, for every $Y \in R^N$.

In the following we discuss the well-known, discomforting general result
that in finite population sampling UMVU estimators generally do not exist.
This happens primarily because of the peculiar situation in a finite population
context where identification of units by labels becomes a major requirement,
without which the sampling design becomes ill defined and hence cannot be
meaningfully implemented. Although the label part of the data carries no
information by itself, a great deal of information will be lost if the labels are ignored, because in
that case one cannot relate the components of the observation to population
units.

Theorem 4.9: A UMVU estimator for a given non-census design $p$ and for a
general parametric function $\theta(Y)$ depending on $Y = (Y_1, Y_2, \ldots, Y_N)$ does not
exist.


Proof. Let $t(s, Y)$ be a $p$-unbiased estimator of the parametric
function $\theta(Y)$. We require to minimize $V_p[t(s, Y)]$ uniformly in $Y$. Assume $t^*(s, Y)$
to be a UMVUE. Then, for every $t$ other than $t^*$,

$$V_p[t^*(s, Y)] \le V_p[t(s, Y)]$$

for every $Y$, and $V_p(t^*) < V_p(t)$ for at least one $Y$.

Now fix a point $Y_0$ in $R^N$. Define

$$t'(s, Y) = t(s, Y) - t(s, Y_0) + \theta(Y_0),$$

so that

$$E_p[t'(s, Y)] = E_p[t(s, Y)] = \theta(Y) \quad \text{for all } Y \in R^N.$$

Now $V_p[t'(s, Y)] = E_p[t(s, Y) - t(s, Y_0) + \theta(Y_0) - \theta(Y)]^2$ is zero when $Y = Y_0$.

Hence at a given point $Y_0$, the variance of a suitably constructed unbiased
estimator can be made equal to zero. As $Y_0$ is arbitrary, it is concluded that
in order to be a UMVUE, $t^*(s, Y)$ must have variance identically equal to
zero for all $Y \in R^N$. This is only possible if the sampling design corresponds to a
total survey of the population (census design). Hence, it is proved that for a
non-census design a UMVUE does not exist.

Now the question arises: does a UMVU estimator exist in a restricted class
of either homogeneous linear estimators or non-homogeneous linear
estimators?

This is discussed in the following theorems.

Theorem 4.10: A UMVUE for the population total $T(Y) = Y$ does not exist in
the class of homogeneous linear unbiased estimators (HLUE) of the form

$$t(s, Y) = \sum_{i \in s} t_{si} Y_i$$

unless the sampling design is a unicluster design, with $\pi_i > 0$ for all $i \in U$.

Proof. As $t(s, Y)$ is an unbiased estimator, $E[t(s, Y)] = Y$, uniformly in $Y$.
This implies

$$E[t(s, Y)] = \sum_{s} p(s) \sum_{i \in s} t_{si} Y_i = \sum_{i=1}^{N} Y_i \sum_{s \ni i} t_{si}\, p(s) = Y = \sum_{i=1}^{N} Y_i, \quad \text{uniformly in } Y.$$

Then, for all $i$, the condition is $\sum_{s \ni i} t_{si}\, p(s) = 1$.

To derive a UMVU estimator the technique is to minimize $V[t(s, Y)]$ with respect to $t_{si}$
subject to the restriction $\sum_{s \ni i} t_{si}\, p(s) = 1$ for $i = 1, 2, \ldots, N$. This leads to the problem
of minimizing

$$E_p[t(s, Y)]^2 = \sum_{s} p(s)\left[\sum_{i \in s} t_{si} Y_i\right]^2 \quad \text{s.t.} \quad \sum_{s \ni i} t_{si}\, p(s) = 1, \quad i = 1, 2, \ldots, N.$$

Using Lagrangian multipliers, we have to minimize

$$\sum_{s} p(s)\left[\sum_{i \in s} t_{si} Y_i\right]^2 - \sum_{i=1}^{N} \lambda_i\left[\sum_{s \ni i} t_{si}\, p(s) - 1\right]$$

for $t_{si}$.

The minimizing equations give

$$\left(2Y_i \sum_{j \in s} t_{sj} Y_j - \lambda_i\right) p(s) = 0, \quad i = 1, 2, \ldots, N.$$

Consider the particular case when

(i) $Y_i \ne 0$, (ii) $Y_j = 0$ for every $j \ne i$, (iii) $s \ni i$, (iv) $p(s) > 0$.

Thus, a possible solution is derived as $t_{si} = \lambda_i/2Y_i^2 = t_i$, say, which is free of
$s$. Thus,

$$t(s, Y) = \sum_{i \in s} t_i Y_i,$$

which, together with unbiasedness, necessarily implies $t_i = 1/\pi_i$, $\pi_i > 0$.

This also indicates that if a UMVU estimator exists, it must be no other
than the HT estimator.

Further, $\sum_{j \in s} Y_j/\pi_j = \lambda_i/2Y_i$ for every $s \ni i$ with $p(s) > 0$ and $Y_i \ne 0$.

Now suppose $s_1$ and $s_2$ are two samples such that $s_1 \ni i$ and $s_2 \ni i$, $p(s_1) > 0$
and $p(s_2) > 0$, $Y_i \ne 0$. Then, uniformly in $Y$,

$$\sum_{j \in s_1} \frac{Y_j}{\pi_j} = \sum_{j \in s_2} \frac{Y_j}{\pi_j},$$

which should be satisfied for the existence of a UMV estimator in the class of
homogeneous linear unbiased estimators. Such a condition is only possible in a
unicluster design. (In a unicluster design either $s_1$ and $s_2$ are disjoint or they are
equivalent, as in the design corresponding to a systematic sample.)

Thus, for any non-unicluster design a UMVU estimator does not exist
among the HLUE's.
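The non-existence result can be seen on the smallest non-unicluster design. The sketch below is an illustration only: the design (SRSWOR with $N = 3$, $n = 2$), the alternative coefficients `alt_w` and the helper names are assumptions chosen for this example. It exhibits an HLUE whose variance is zero at $Y = (1, 1, 0)$, where the HT estimator has positive variance, while at $Y = (1, 0, 0)$ the HT estimator is the better of the two, so neither dominates uniformly.

```python
from fractions import Fraction as F

samples = [(0, 1), (0, 2), (1, 2)]            # SRSWOR, N = 3, n = 2
p = {s: F(1, 3) for s in samples}

# HT coefficients 1/pi_i with pi_i = 2/3.
ht_w = {(s, i): F(3, 2) for s in samples for i in s}

# A hypothetical alternative set of coefficients t_si, tuned to Y0 = (1, 1, 0).
alt_w = {((0, 1), 0): F(1), ((0, 1), 1): F(1),
         ((0, 2), 0): F(2), ((0, 2), 2): F(3, 2),
         ((1, 2), 1): F(2), ((1, 2), 2): F(3, 2)}

def is_unbiased(w):
    """Check sum over s containing i of t_si p(s) = 1 for every unit i."""
    return all(sum(p[s] * w[(s, i)] for s in samples if i in s) == 1
               for i in range(3))

def variance(w, y):
    """Design variance of the homogeneous linear estimator with weights w."""
    est = {s: sum(w[(s, i)] * y[i] for i in s) for s in samples}
    mean = sum(p[s] * est[s] for s in samples)
    return sum(p[s] * (est[s] - mean) ** 2 for s in samples)

var_ht_a = variance(ht_w, (1, 1, 0))
var_alt_a = variance(alt_w, (1, 1, 0))
var_ht_b = variance(ht_w, (1, 0, 0))
var_alt_b = variance(alt_w, (1, 0, 0))
```

The second comparison, at a point with a single non-zero component, is consistent with the HT estimator being the only possible candidate for a UMVU estimator in this class.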

Theorem 4.11: Among the class of linear unbiased estimators of the form

$$t_1(s, Y) = t_s + \sum_{i \in s} t_{si} Y_i,$$

where $t_s$ is free of $Y$, $E_p(t_s) = 0$ and $E_p[t_1(s, Y)] = Y$, a UMVU estimator does not
exist.

Proof: Write the generalized linear estimator $t_1(s, Y)$ in the alternative
form

$$t_1 = \beta_0 + \sum_{j=1}^{N} \beta_j I_j Y_j,$$

where $\beta_0$ and the $\beta_j$'s are independent of the observations and are functions of the label part
of the data, and $I_j$ is the sample membership indicator of unit $j$.

Denote by $L_u$ the class of generalized linear unbiased estimators of $Y$. When
$\pi_i > 0$ for each $i$, $L_u$ is never empty because we have already seen that the HT
estimator

$$t = t(s, Y) = \sum_{j=1}^{N} I_j \frac{Y_j}{\pi_j}$$

is a member of $L_u$. Now let $Y_0 = (a_1, a_2, \ldots, a_N)$ be a fixed point in $R^N$. Define

$$t_1^* = \sum_{j=1}^{N} I_j \frac{Y_j - a_j}{\pi_j} + \sum_{j=1}^{N} a_j,$$

which is also a member of $L_u$. When $Y = Y_0$, $t_1^* = \sum_{j=1}^{N} a_j$ and hence $V(t_1^*)$ is zero. Since $Y_0$ is arbitrary, an estimator
is UMV iff it has zero variance at all $Y \in R^N$. This is possible iff $p$ is a census
design.

Thus, it follows that for any non-census design there cannot exist a
UMVU estimator in the class of generalized linear unbiased estimators (Basu,
1971).

The non-existence of a uniformly minimum variance estimator in the class of
all unbiased estimators brings us to the question of whether there exists an estimator,
at least in a particular class, which cannot be uniformly improved upon. Such
a property of an estimator is called admissibility.
However, an admissible estimator in any class may not be unique. For
non-unicluster designs, there may be a large number of admissible estimators of the
population total. In the following, the admissibility of the
HT estimator of the population total is examined based on the pioneering work
of Godambe (1960), Godambe and Joshi (1965) and others.

Theorem 4.12: For any fixed sample size design with $\pi_i > 0$, the HT estimator
is admissible in the class of all homogeneous linear unbiased estimators
(HLUE) of the population total $T(Y) = Y$.


Proof. Define the HT estimator as $t_{HT} = \sum_{i \in s} Y_i/\pi_i$ and any other estimator
within the HLUE class as $t = t(s, Y) = \sum_{i \in s} t_{si} Y_i$. Now,

$$V(t_{HT}) = \sum_{i=1}^{N} Y_i^2\left(\frac{1}{\pi_i} - 1\right) + \sum\sum_{i \ne j} Y_i Y_j\left(\frac{\pi_{ij}}{\pi_i \pi_j} - 1\right)$$

and

$$V(t) = V[t(s, Y)] = \sum_{i=1}^{N} Y_i^2\left[\sum_{s \ni i} t_{si}^2\, p(s) - 1\right] + \sum\sum_{i \ne j} Y_i Y_j\left[\sum_{s \ni i, j} t_{si} t_{sj}\, p(s) - 1\right].$$

At a point $S(Y_i) = (0, 0, \ldots, Y_i, \ldots, 0)$, $1 \le i \le N$, $Y_i \ne 0$,

$$V(t) - V(t_{HT}) = Y_i^2\left[\sum_{s \ni i} t_{si}^2\, p(s) - \frac{1}{\pi_i}\right].$$

By virtue of the Cauchy-Schwarz inequality,

$$\left[\sum_{s \ni i} t_{si}^2\, p(s)\right]\left[\sum_{s \ni i} p(s)\right] \ge \left[\sum_{s \ni i} t_{si}\, p(s)\right]^2.$$

Using the relations

$$\sum_{s \ni i} p(s) = \pi_i \quad \text{and} \quad \sum_{s \ni i} t_{si}\, p(s) = 1,$$

$$V[t(s, Y)] - V(t_{HT}) = Y_i^2\left[\sum_{s \ni i} t_{si}^2\, p(s) - \frac{1}{\pi_i}\right] \ge 0.$$

The equality is attained if and only if $t_{si} = 1/\pi_i$ for every $s \ni i$ with $p(s) > 0$.
Considering

$$C = \bigcup_{i=1}^{N}\,[S(Y_i) \mid Y_i \ne 0] = \bigcup_{i=1}^{N}\,[(0, 0, \ldots, Y_i, \ldots, 0) \mid Y_i \ne 0],$$

we see that $V[t(s, Y)] \ge V(t_{HT})$ for all $Y \in C \subset R^N$, with strict inequality for at
least one $Y \in C$. Otherwise $t_{si} = 1/\pi_i$ for every $i \in s$, implying thereby that $t(s, Y)$
and $t_{HT}$ are identical. This establishes the admissibility of the HTE in the class
of all HLUE.
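The Cauchy-Schwarz step in the proof can be probed numerically. The sketch below is an assumption-laden illustration: it builds an arbitrary fixed size design with random sample probabilities, generates random coefficients $t_{si}$ rescaled to satisfy the unbiasedness condition, and records the gap $\sum_{s \ni i} t_{si}^2 p(s) - 1/\pi_i$, which the inequality says can never be negative. The seed, the ranges and the helper name are arbitrary choices.

```python
import random
from itertools import combinations

random.seed(0)
N, n = 5, 3
samples = list(combinations(range(N), n))

# An arbitrary (randomly generated) fixed sample size design.
raw_p = [random.random() for _ in samples]
total_p = sum(raw_p)
p = {s: w / total_p for s, w in zip(samples, raw_p)}
pi = [sum(p[s] for s in samples if i in s) for i in range(N)]

def random_hlue_coefficients(i):
    """Random t_si over samples containing i, rescaled so that
    sum over s of t_si p(s) = 1 (the unbiasedness condition)."""
    raw = {s: random.uniform(0.1, 3.0) for s in samples if i in s}
    scale = sum(raw[s] * p[s] for s in raw)
    return {s: t / scale for s, t in raw.items()}

gaps = []
for i in range(N):
    for _ in range(200):
        t = random_hlue_coefficients(i)
        gaps.append(sum(t[s] ** 2 * p[s] for s in t) - 1.0 / pi[i])

# For the HT coefficients t_si = 1/pi_i the gap vanishes.
ht_gaps = [sum(p[s] / pi[i] ** 2 for s in samples if i in s) - 1.0 / pi[i]
           for i in range(N)]
```

Multiplying a gap by $Y_i^2$ gives the excess variance over $t_{HT}$ at the point $S(Y_i)$.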

It may be pointed out here that admissible estimators may not be unique,
that is, there may exist more than one admissible estimator in the same class $C$.
Moreover, the criterion used in a particular case cannot be exploited very far to
choose the best estimator. An estimator which depends on the order of selection
and the multiplicity of units in $s$ is an inadmissible estimator. The criterion of
admissibility attaches some amount of reasonableness to an estimator and
ensures that the estimator is uniquely best in $C$ at least at some point $Y$ in the
parameter space.

Admissibility of the HT estimator within the wider class of all unbiased
estimators of the population total $Y$ can be proved following Godambe and Joshi
(1965) and Cassel et al. (1977).

Theorem 4.13: For any sampling design $p$ with $\pi_i > 0$ for all $i$, the generalized
difference estimator $t_{gd} = \sum_{i \in s} (Y_i - a_i)/\pi_i + \sum_{i=1}^{N} a_i$ is admissible in the class of
all unbiased estimators of $Y$.

Proof: Suppose $t_{gd}$ is not admissible. Then there exists an estimator

$$t(s, Y) = t_{gd} + (t(s, Y) - t_{gd}) = t_{gd} + h(s, Y)$$

which is unbiased for $Y$ and is uniformly better than $t_{gd}$. Hence

$$\sum_{s \in S} h(s, Y)p(s) = 0 \text{ for all } Y \in R^N, \quad \text{and} \quad E[t_{gd} + h(s, Y)]^2 \le E(t_{gd}^2).$$

That is,

$$\sum_{s \in S} h^2(s, Y)p(s) \le -2\sum_{s \in S} t_{gd}\, h(s, Y)p(s) \quad \text{for all } Y \in R^N.$$

Since $h(s, Y)$ is a difference of two estimators, it is a function defined on $S \times R^N$ and as
such, for a given $s$, its value depends on $Y$ only through those $Y_i$ for which $i \in s$.

Let $Q_m = \{Y:$ exactly $m$ components of $Y - a$ are non-zero$\}$, where $a =
(a_1, \ldots, a_N)$. We now state the following Lemma 1 (proof omitted).

Lemma 1: If $h(s, Y)p(s) = 0$ for all $(s, Y)$ such that $p(s) > 0$ and $Y \in Q_r$, then
$h(s, Y)p(s) = 0$ for all $(s, Y)$ with $Y \in Q_{r+1}$, $r = 0, 1, 2, \ldots$.

For $Y \in Q_0$, that is $Y = a$, $t_{gd}$ takes the same value for every sample, so the
right-hand side of the inequality above vanishes and $h(s, Y) = 0$ for all $s$ with $p(s) > 0$.
Hence, by repeated application of Lemma 1, $h(s, Y) = 0$ whenever $Y \in Q_1, Q_2,
\ldots, Q_N$ for all $s$ with $p(s) > 0$. Thus, it follows that $h(s, Y)p(s) = 0$ for every $s$ and
every $Y$ in $R^N$ if $t(s, Y)$ is to be better than $t_{gd}$. Therefore, if $t(s, Y)$ is to be
better than $t_{gd}$, then for every sample $s$ with $p(s) > 0$, $t(s, Y)$ must be identical with $t_{gd}$.
This proves that there does not exist any estimator in the class of unbiased
estimators which is uniformly better than $t_{gd}$. Hence, $t_{gd}$ is admissible in the
class of all unbiased estimators.

Corollary 1: For any $p$ with $\pi_i > 0$ for all $i$, $t_{HT}$ is admissible in the class of all
unbiased estimators of $Y$.

The proof follows by putting $a = 0$ in $t_{gd}$. Some general theorems (without
proof) concerning admissibility are stated in the following for reference.

Theorem 4.14: Any function $t(s, Y)$ defined on $S \times R^N$ which depends on $(s, Y)$
only through those $(i, Y_i)$ for which $Y_i \ne 0$ is an admissible estimator of its
expectation $T(Y) = \sum_{s} t(s, Y)p(s)$ for any given $p$ (Mukhopadhyay and Tracy).

Theorem 4.15: For any sampling design $p \in \rho_n$, any estimator

$$t_1(s, Y) = \sum_{k \in s} b_k Y_k \quad \text{with } b_k > 1, \ \sum (1/b_k) = n$$

is admissible in the class of all linear estimators of $Y$ (Godambe and Joshi,
1965).

Theorem 4.16: For any sampling design $p$, the sample mean of distinct units is
admissible in the class of all estimators of $\bar{Y}$ (Joshi, 1965).

Theorem 4.17: For a varying sample size design $p$, $t_1(s, Y)$ is inadmissible in the
class of all estimators of $Y$ (Godambe and Joshi, 1965).

Theorem 4.18: A linear estimator $t_2(s, Y) = \sum_{i \in s} b_{si} Y_i$, with

$$b_{si} = 1 + \frac{k_i}{\sum_{j \in s} k_j q_j}\left(Q - \sum_{j \in s} q_j\right),$$

where

i) $q_1, \ldots, q_N$ are non-zero constants with $Q = \sum_{i=1}^{N} q_i$,

ii) $(w_1, \ldots, w_N)$ is a set of positive numbers,

iii) $k_i = q_i/w_i$,

is admissible in the class of all homogeneous linear estimators of $Y$ (Sengupta,
1980).

Corollary 1: For any WOR sampling design $p$, $N\bar{y}_s$ is admissible in the class of all
homogeneous linear estimators of $Y$. This follows from Theorem 4.18 by letting
$w_i = q_i = 1/N$, which makes $t_2(s, Y) = N\bar{y}_s$. A more general result as to the
admissibility of $N\bar{y}_s$ in the class of all estimators of $Y$ is due to Godambe and
Joshi (1965).

Corollary 2: For any sampling design $p$, $\left(\sum_{i \in s} y_i / \sum_{i \in s} x_i\right)X$, where $x_i$ is the size
measure of unit $i$ and $X$ is the population total of the $x_i$, is admissible in the class of
HLE of $Y$. This follows by putting $w_i = q_i = x_i/X$, which results in

$$t_2(s, Y) = \left(\frac{\sum_{i \in s} y_i}{\sum_{i \in s} x_i}\right)X.$$

However, a more general result is due to Joshi (1965, 1966).

Corollary 3: Murthy's estimator under a PPSWOR design with $n = 2$ is
admissible in the class of HLE of $Y$. Put $q_i = p_i$, $w_i = p_i^2/(1 - p_i)$, $s = (i, j)$.
Sengupta (1980) also proved the admissibility of Murthy's estimator in the
class of all unbiased estimators of $Y$.
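The three corollaries can be checked against the coefficient $b_{si} = 1 + k_i\,(Q - \sum_{j \in s} q_j)/\sum_{j \in s} k_j q_j$ of Theorem 4.18. The following sketch is illustrative: the data vectors, the chosen sample and the function names are assumptions, and `murthy_n2` is the standard closed form of Murthy's estimator for $n = 2$, written out here only for comparison.

```python
def sengupta_estimate(sample, y, q, w):
    """t_2(s, Y) = sum over i in s of b_si y_i, with
    b_si = 1 + k_i (Q - sum_s q_j) / sum_s k_j q_j and k_i = q_i / w_i."""
    Q = sum(q)
    k = [qi / wi for qi, wi in zip(q, w)]
    q_s = sum(q[i] for i in sample)
    kq_s = sum(k[i] * q[i] for i in sample)
    return sum((1.0 + k[i] * (Q - q_s) / kq_s) * y[i] for i in sample)

def murthy_n2(i, j, y, p):
    """Murthy's unordered estimator of the total for a PPSWOR sample of size 2."""
    return ((1 - p[j]) * y[i] / p[i] + (1 - p[i]) * y[j] / p[j]) / (2 - p[i] - p[j])

# Illustrative data (assumed for this example).
y = [3.0, 7.0, 9.0, 25.0]
x = [1.0, 3.0, 4.0, 12.0]
N, X = len(y), sum(x)
sample = (1, 3)

# Corollary 1: w_i = q_i = 1/N gives N times the sample mean.
q1 = [1.0 / N] * N
expansion = sengupta_estimate(sample, y, q1, q1)

# Corollary 2: w_i = q_i = x_i / X gives the ratio estimator.
q2 = [xi / X for xi in x]
ratio = sengupta_estimate(sample, y, q2, q2)

# Corollary 3: q_i = p_i, w_i = p_i^2 / (1 - p_i) gives Murthy's estimator.
p = q2
w3 = [pi ** 2 / (1 - pi) for pi in p]
murthy_via_theorem = sengupta_estimate(sample, y, p, w3)
```

All three special cases share one routine and differ only in the choice of the constants $q_i$ and $w_i$.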

The technique of proving admissibility of the HT estimator is not universally
applicable. Regarding other methods of constructing admissible estimators one
may refer to Patel and Dharmadhikari (1977, 1978), Meeden and Ghosh (1983,
1984), Sengupta (1982b, 1983) and Sinha and Pantula (1986). Roy and
Chakravarty (1960) advocated linear invariance as a
desirable property to be possessed by any estimator of the population mean or total.
Patel and Dharmadhikari (1977) used the concept of linear invariance for
constructing admissible (linear invariant) unbiased estimators.

Theorem 4.21 (Godambe, 1960 and Lanke, 1973): For any sampling design $p$
which has $\pi_i > 0$ for all $i$, the Horvitz-Thompson estimator $t_{HT}$ has uniformly
minimal variance in the class of all linear unbiased estimators of $Y$ when the
parameter $Y$ is restricted to the set $R^N(1)$ of vectors in $R^N$ having at most one
non-zero component.

Theorem 4.22: If $p$ is a non-unicluster design, the Horvitz-Thompson estimator
$t_{HT}$ does not have uniformly minimal variance in the class of all linear unbiased
estimators of $Y$ when the parameter $Y$ is restricted to the set $R^N(2)$ of vectors in
$R^N$ having at most two non-zero components.

Proof: Refer to Lanke (1973).

4.3.9 Horvitz-Thompson Type Estimators of Quadratic 
Finite Population Parametric Functions 


The finite population variance, where $y$ is the character under study, is

$$\sigma_y^2 = \frac{1}{N}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 = \frac{1}{N}\left(1 - \frac{1}{N}\right)\sum_{i=1}^{N} Y_i^2 - \frac{1}{N^2}\sum\sum_{i \ne j} Y_i Y_j,$$

which is a quadratic form

$$Q(Y) = \sum_{i=1}^{N}\sum_{j=1}^{N} a_{ij} Y_i Y_j,$$

which is estimable if and only if $\pi_{ij} > 0$ for all $i, j$ such that $a_{ij} + a_{ji} \ne 0$ (Hanurav,
1966). An unbiased estimator of $Q(Y)$ is

$$e(s, Y) = \sum_{i \in s} a_{ii}\frac{Y_i^2}{\pi_i} + \sum\sum_{i \ne j \in s} a_{ij}\frac{Y_i Y_j}{\pi_{ij}}.$$

Thus, for any sampling design having the inclusion probabilities $\pi_{ij} > 0$ for all $i, j
= 1, \ldots, N$, there exists a general unbiased estimator of $\sigma_y^2$, given
by Liu (1974) as

$$v_1 = \frac{1}{N}\left(1 - \frac{1}{N}\right)\sum_{i \in s}\frac{y_i^2}{\pi_i} - \frac{1}{N^2}\sum\sum_{i \ne j \in s}\frac{y_i y_j}{\pi_{ij}}.$$

Liu's estimator can alternatively be obtained by estimating the expression
$\sigma_y^2 = (1/N)\sum Y_i^2 - \bar{Y}^2$ term by term. That is,

$$\mathrm{Est}(\sigma_y^2) = \frac{1}{N}\sum_{i \in s}\frac{y_i^2}{\pi_i} - \mathrm{Est}(\bar{Y}^2) = \frac{1}{N}\sum_{i \in s}\frac{y_i^2}{\pi_i} - \frac{1}{N^2}\left[\hat{Y}_{HT}^2 - \mathrm{Est}_{HT}\, V(\hat{Y}_{HT})\right],$$

which simplifies to $v_1$.

Using this technique, Swain and Mishra (1994) have derived another
generalized estimator of the population variance $\sigma_y^2$,

$$v_2 = \frac{1}{N}\sum_{i \in s}\frac{y_i^2}{\pi_i} - \frac{1}{N^2}\left[\hat{Y}_{HT}^2 - \mathrm{Est}_{SYG}\, V(\hat{Y}_{HT})\right]$$

$$= \frac{1}{N}\sum_{i \in s}\frac{y_i^2}{\pi_i} - \frac{1}{N^2}\hat{Y}_{HT}^2 + \frac{1}{2N^2}\sum\sum_{i \ne j \in s}\frac{\pi_i \pi_j - \pi_{ij}}{\pi_{ij}}\left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2,$$

with $\pi_i > 0$ for all values of $i$ and $\pi_{ij} > 0$ for all $(i, j)$. $v_1$ and $v_2$ suffer from the
same drawback in the sense that they sometimes assume negative values.

Another alternative estimator of $\sigma_y^2$, which is always non-negative, is
given by Chaudhuri (1978) as

$$v_3 = \frac{1}{2N^2}\sum\sum_{i \ne j \in s}\frac{(y_i - y_j)^2}{\pi_{ij}}.$$

The variances of $v_1$, $v_2$ and $v_3$ and their estimates have been derived respectively
by Liu (1974), Swain and Mishra (1994) and Chaudhuri (1978). The variance
expressions are all quartic functions of the $y$'s. In an empirical study covering a large
number of natural and artificial populations, Swain and Mishra (1994) have
shown that $v_2$ takes negative values less often than $v_1$ and that for most of the
populations $v_2$ is more efficient than both $v_1$ and $v_3$. Other unbiased estimators
(not necessarily always non-negative) can be obtained following the techniques
enunciated in section 4.2.7. Liu and Thompson (1983) have shown that no best
estimator exists for the class of estimators which are linear in the functions $(y_i -
y_j)^2$, and moreover that the estimator is an admissible estimator of the population
variance. For details on the estimation of the population
variance, the reader may refer to Chaudhuri, Mukhopadhyay (1989, 1990),
Sengupta (1988), and Mukhopadhyay and Bhattacharya.
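The unbiasedness of the three estimators can be confirmed by complete enumeration. The sketch below is an illustration under assumed conditions: SRSWOR of $n = 2$ from a four-unit population with made-up $y$ values, so that $\pi_i = n/N$ and $\pi_{ij} = n(n-1)/(N(N-1))$; the double sums run over ordered pairs $i \ne j$ in the sample.

```python
from itertools import combinations

y = [3.0, 5.0, 6.0, 10.0]           # illustrative population values
N, n = len(y), 2
samples = list(combinations(range(N), n))
p_s = 1.0 / len(samples)            # SRSWOR: every sample equally likely
pi = [n / N] * N
pij = n * (n - 1) / (N * (N - 1))   # same joint probability for every pair

def ordered_pairs(s):
    return [(i, j) for i in s for j in s if i != j]

def v1(s):
    """Liu-type estimator of the population variance."""
    a = (1 / N) * (1 - 1 / N) * sum(y[i] ** 2 / pi[i] for i in s)
    b = sum(y[i] * y[j] / pij for i, j in ordered_pairs(s)) / N ** 2
    return a - b

def v2(s):
    """Estimator built from the HT total and the SYG variance estimator."""
    ht = sum(y[i] / pi[i] for i in s)
    syg = 0.5 * sum((pi[i] * pi[j] - pij) / pij
                    * (y[i] / pi[i] - y[j] / pi[j]) ** 2
                    for i, j in ordered_pairs(s))
    return sum(y[i] ** 2 / pi[i] for i in s) / N - (ht ** 2 - syg) / N ** 2

def v3(s):
    """Always non-negative estimator based on squared differences."""
    return sum((y[i] - y[j]) ** 2 / pij
               for i, j in ordered_pairs(s)) / (2 * N ** 2)

sigma2 = sum(v ** 2 for v in y) / N - (sum(y) / N) ** 2
means = {f.__name__: sum(p_s * f(s) for s in samples) for f in (v1, v2, v3)}
v3_values = [v3(s) for s in samples]
```

Each design expectation in `means` reproduces $\sigma_y^2$, and every value of `v3` is non-negative by construction.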
NUMERICAL EXAMPLES

Example 4.1: Compute the relative efficiency of unequal probability with
replacement sampling with respect to equal probability without replacement sampling to
estimate the population mean for $n = 1$ from the following data. (The variances below
are written for the estimator of the total; the relative efficiency is the same for the mean.)


Table 4E.1. A hypothetical population (N = 4)

Unit    1     2     3     4
Y_i     3     5     6     10
P_i     0.1   0.2   0.3   0.4

Solution:

Y_i     P_i    Y_i/P_i    (Y_i/P_i - Y)^2    P_i (Y_i/P_i - Y)^2
3       0.1    30         36                 3.6
5       0.2    25         1                  0.2
6       0.3    20         16                 4.8
10      0.4    25         1                  0.4
Total   24                                   9

$$V(\hat{Y}_{ppswr}) = \sum_{i=1}^{N} P_i (Y_i/P_i - Y)^2 = 9$$

$$V(\hat{Y}_{srswor}) = N(N - 1)S^2 = N\left(\sum Y_i^2 - N\bar{Y}^2\right) = 4[170 - 4 \times 36] = 104$$

$$\%\ \mathrm{R.E.} = \frac{V(\hat{Y}_{srswor})}{V(\hat{Y}_{ppswr})} \times 100 = 104/9 \times 100 = 1155.56\%.$$
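The arithmetic of Example 4.1 can be reproduced in a few lines. This is a minimal sketch; the variable names are chosen for the illustration, and the formulas are the two variance expressions used in the solution for a single draw.

```python
y = [3, 5, 6, 10]
p = [0.1, 0.2, 0.3, 0.4]
N = len(y)
Y = sum(y)

# Variance of the PPSWR estimator of the total for n = 1.
v_ppswr = sum(pi * (yi / pi - Y) ** 2 for yi, pi in zip(y, p))

# Variance of the SRSWOR estimator of the total for n = 1: N(N-1)S^2.
v_srswor = N * (sum(yi ** 2 for yi in y) - Y ** 2 / N)

re_percent = 100 * v_srswor / v_ppswr
print(round(v_ppswr, 4), round(v_srswor, 4), round(re_percent, 2))
```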


Example 4.2: A pilot scheme for the study of cultivation practices and yield of
guava crop was carried out by I.C.A.R. in Allahabad district of U.P. (India)
during 1960-61. The data below give the number of guava trees in 19 villages
selected from Chail Tahsil with replacement with probability proportional to
area under fresh fruits. The total area under fresh fruits in Chail Tahsil is 2051
acres.

Estimate the total number of guava trees in the tahsil along with its
standard error.

Table 4E.2. Number of guava trees and area under fresh fruits in the selected villages.

Sl. No. of village    No. of guava trees    Area under fresh fruits (acres)
 1                    25801                 127
 2                     2919                  32
 3                    10785                  75
 4                    12084                  57
 5                    12771                  68
 6                    26410                  61
 7                     1864                   6
 8                     5959                  27
 9                     1645                  68
10                     1799                  13
11                     1460                   8
12                     1180                   8
13                     4040                  22
14                     1378                  10
15                     1407                  30
16                    19088                  99
17                     1063                   9
18                    11128                  63
19                     1991                  13


Solution: Let $y_i$ be the number of guava trees in the $i$th selected village, $i
= 1, \ldots, 19$, and $A_i$ be the area under fresh fruits in the $i$th selected village.

$$A = \sum_{i=1}^{N} A_i = \text{total area under fresh fruits in the tahsil} = 2051 \text{ acres (given)}.$$

The probability of selecting the $i$th village is $P_i = A_i/A$ $(i = 1, \ldots, N)$.

Define $Z_i = y_i/NP_i = (A/N)(y_i/A_i)$.

An unbiased estimate of the population total number of guava trees ($Y$) is

$$\hat{Y} = N\bar{Z}_n = (N/n)(A/N)\sum_{i=1}^{n}(y_i/A_i) = \frac{A}{n}\sum_{i=1}^{n}\frac{y_i}{A_i} = \frac{2051}{19} \times 3303.926 = 356{,}650 \text{ trees}.$$

$$\mathrm{Est}\, V(N\bar{Z}_n) = \frac{A^2}{n(n-1)}\left[\sum_{i=1}^{n}\left(\frac{y_i}{A_i}\right)^2 - \frac{1}{n}\left(\sum_{i=1}^{n}\frac{y_i}{A_i}\right)^2\right]$$

$$= 12300.003 \times 144{,}125.717642 \approx 1.7727 \times 10^{9}.$$

Estimated S.E. $(N\bar{Z}_n) = 42104$.
Example 4.3: Calculate 7t, and ity for the following artificial population of four 
units with * as the size variable when two units are selected without 
replacement with PPS at each step. 

Also, compute the variance of the H-T estimator of population total. How 
does it compare with the variance in case of SRSWOR and PPSWR? 


Table 4E.3. Hypothetical population 


SI. No. of unit 

y 

X 

1 

3 

1 

2 

7 

3 

3 

9 

4 

4 

25 

12 


Solution : Calculation of inclusion probabilities. 


Sample No. 

Units 

Probability of ordered pair 
of units (Pj.Pjii) 

1 

(1.2) 

(1/20).(3/19) = 0.0079 

2 

(1.3) 

(1/20).(4/19) =0.0105 

3 

(1.4) 

(1/20).(12/19) = 0.0316 

4 

(2,1) 

(3/20).(1/17) =0.0088 

5 

(2,3) 

(3/20). (1/17) =0.0353 

6 

(2,4) 

(3/20).(12/17) = 0.1059 

7 

(3,1) 

(4/20).(1/16) =0.0125 

8 

(3,2) 

(4/20). (3/16) = 0375 

9 

(3,4) 

(4/20) .(12/16) =0.1500 

10 

(4,1) 

(12/20).(1/8) = 0.0750 

11 

(4,2) 

(12/20).(3/8) = 0.2250 

12 

(4,3) 

(12/20).(4/8) = 0.3000 


$\pi_1$ = 0.0079 + 0.0105 + 0.0316 + 0.0088 + 0.0125 + 0.0750 = 0.1463

$\pi_2$ = 0.0079 + 0.0088 + 0.0353 + 0.1059 + 0.0375 + 0.2250 = 0.4204

$\pi_3$ = 0.0105 + 0.0353 + 0.0125 + 0.0375 + 0.1500 + 0.3000 = 0.5458

$\pi_4$ = 0.0316 + 0.1059 + 0.1500 + 0.0750 + 0.2250 + 0.3000 = 0.8875

$\pi_{12}$ = 0.0079 + 0.0088 = 0.0167
$\pi_{13}$ = 0.0105 + 0.0125 = 0.0230
$\pi_{14}$ = 0.0316 + 0.0750 = 0.1066
$\pi_{23}$ = 0.0353 + 0.0375 = 0.0728
$\pi_{24}$ = 0.1059 + 0.2250 = 0.3309
$\pi_{34}$ = 0.1500 + 0.3000 = 0.4500
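The enumeration of ordered pairs above can be checked mechanically. The following is a minimal Python sketch, assuming sample size 2 and PPS selection at each step; the function name `ppswor_inclusion_probs` is illustrative and not from the text.

```python
from itertools import permutations


def ppswor_inclusion_probs(sizes):
    """Inclusion probabilities for n = 2 draws, PPS at each step, without replacement."""
    total = sum(sizes)
    pi = [0.0] * len(sizes)   # first-order inclusion probabilities
    pij = {}                  # second-order probabilities, keyed by (i, j) with i < j
    for i, j in permutations(range(len(sizes)), 2):
        # P(unit i drawn first) * P(unit j drawn second | unit i removed)
        prob = (sizes[i] / total) * (sizes[j] / (total - sizes[i]))
        pi[i] += prob
        pi[j] += prob
        key = (min(i, j), max(i, j))
        pij[key] = pij.get(key, 0.0) + prob
    return pi, pij


# Size variable x of Table 4E.3 (units are indexed from 0 in the code)
pi, pij = ppswor_inclusion_probs([1, 3, 4, 12])
print([round(v, 4) for v in pi])
print({(i + 1, j + 1): round(v, 4) for (i, j), v in pij.items()})
```

Since every sample contains exactly two units, the first-order probabilities should sum to 2, which is a quick sanity check on any such table.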
Calculation of $V(\hat{Y}_{HT})$

Units (i, j)   $\pi_i\pi_j$   $\pi_{ij}$   $\pi_i\pi_j - \pi_{ij}$   $(Y_i/\pi_i - Y_j/\pi_j)^2$
   (1,2)        0.061504       0.0167       0.044804                   14.8610
   (1,3)        0.079850       0.0230       0.056850                   16.1299
   (1,4)        0.129841       0.1066       0.023241                   58.7246
   (2,3)        0.229456       0.0728       0.156656                    0
   (2,4)        0.373105       0.3309       0.042205                  132.6689
   (3,4)        0.484398       0.4500       0.034398                  136.4084


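\[ V(\hat{Y}_{HT}) = \sum_{i=1}^{N}\sum_{j>i} (\pi_i\pi_j - \pi_{ij})\left(\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right)^2 = 13.2432 \]

\[ V(\hat{Y}_{PPSWR}) = \frac{1}{n}\sum_{i=1}^{N} p_i\left(\frac{Y_i}{p_i} - Y\right)^2 = 17.3333/2 = 8.6667 \]

\[ V(\hat{Y}_{SRSWOR}) = \frac{N(N-n)}{n(N-1)}\left[\sum_{i=1}^{N} Y_i^2 - \frac{Y^2}{N}\right] = \frac{4(4-2)}{(2)(3)}\,[764 - 484] = 373.33 \]

R.E. (PPSWOR over SRSWOR) = (373.33/13.24) x 100 = 2820%

R.E. (PPSWOR over PPSWR) = (8.6667/13.24) x 100 = 65%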
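As a check on the arithmetic of the H-T and SRSWOR variances, here is a short Python sketch. It takes the rounded inclusion probabilities of this example as given inputs; the function names are illustrative assumptions, not notation from the text.

```python
from itertools import combinations

# y-values and rounded inclusion probabilities of Example 4.3 (units indexed from 0)
y = [3, 7, 9, 25]
pi = [0.1463, 0.4204, 0.5458, 0.8875]
pij = {(0, 1): 0.0167, (0, 2): 0.0230, (0, 3): 0.1066,
       (1, 2): 0.0728, (1, 3): 0.3309, (2, 3): 0.4500}


def ht_variance_yg(y, pi, pij):
    """Variance of the H-T estimator of the total in the Yates-Grundy form (fixed n)."""
    return sum((pi[i] * pi[j] - pij[i, j]) * (y[i] / pi[i] - y[j] / pi[j]) ** 2
               for i, j in combinations(range(len(y)), 2))


def srswor_total_variance(y, n):
    """Variance of the expansion estimator N * ybar under SRSWOR."""
    N = len(y)
    corrected_ss = sum(v * v for v in y) - sum(y) ** 2 / N
    return N * (N - n) / (n * (N - 1)) * corrected_ss


v_ht = ht_variance_yg(y, pi, pij)
v_srs = srswor_total_variance(y, n=2)
print(round(v_ht, 2), round(v_srs, 2))
```

Because the inputs are rounded to four decimals, the H-T variance agrees with the tabulated value only to about two decimal places.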


Example 4.4: Drawing a WOR sample of size 2 from the following artificial 
population (Yates and Grundy 1953) compute the estimates of relative 
efficiency of unequal probability without replacement schemes - (i) Des Raj’s 
WOR scheme (ordered) (ii) Murthy’s WOR scheme (unordered), (iii) 
PPSWOR (PPS at each step) sampling, (iv) Hartley-Rao PPS systematic 
sampling (v) Rao-Hartley-Cochran method of sampling and (vi) Midzuno-Sen 
scheme of sampling to estimate the population total Y. 

Table 4E.4. Population A (Yates and Grundy, 1953)

Unit No.    $p_i$    $Y_i$
   1         0.1      0.5
   2         0.2      1.2
   3         0.3      2.1
   4         0.4      3.2
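Solution: It is supposed that the units 2 and 4 are selected in all the
methods of sampling under consideration. Thus, the sample observations are
$y_1 = 1.2$, $y_2 = 3.2$, with $p_1 = 0.2$ and $p_2 = 0.4$.

PPSWR Sampling:

\[ \hat{Y}_{PPSWR} = \frac{1}{2}\left(\frac{y_1}{p_1} + \frac{y_2}{p_2}\right) = \frac{1}{2}\left(\frac{1.2}{0.2} + \frac{3.2}{0.4}\right) = 7 \]

\[ \text{Est. } V(\hat{Y}_{PPSWR}) = \frac{1}{2(2-1)}\left[\left(\frac{y_1}{p_1} - \hat{Y}_{PPSWR}\right)^2 + \left(\frac{y_2}{p_2} - \hat{Y}_{PPSWR}\right)^2\right] = 1 \]
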

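Des Raj’s WOR scheme (ordered):

\[ \hat{Y}_D = \tfrac{1}{2}\left[(y_1/p_1)(1 + p_1) + (y_2/p_2)(1 - p_1)\right] = \tfrac{1}{2}\left[(1.2/0.2)(1 + 0.2) + (3.2/0.4)(1 - 0.2)\right] = 6.8 \]

\[ \text{Est. } V(\hat{Y}_D) = \tfrac{1}{4}(1 - p_1)^2\left(\frac{y_1}{p_1} - \frac{y_2}{p_2}\right)^2 = \tfrac{1}{4}(1 - 0.2)^2\left(\frac{1.2}{0.2} - \frac{3.2}{0.4}\right)^2 = 0.64 \]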

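Murthy’s WOR scheme (unordered):

\[ \hat{Y}_M = \frac{1}{2 - p_1 - p_2}\left[\frac{y_1}{p_1}(1 - p_2) + \frac{y_2}{p_2}(1 - p_1)\right] = \frac{1}{2 - 0.2 - 0.4}\left[\frac{1.2}{0.2}(1 - 0.4) + \frac{3.2}{0.4}(1 - 0.2)\right] = 7.14 \]

\[ \text{Est. } V(\hat{Y}_M) = \frac{(1 - p_1)(1 - p_2)(1 - p_1 - p_2)}{(2 - p_1 - p_2)^2}\left(\frac{y_1}{p_1} - \frac{y_2}{p_2}\right)^2 = \frac{(1 - 0.2)(1 - 0.4)(1 - 0.2 - 0.4)}{(2 - 0.2 - 0.4)^2}\left(\frac{1.2}{0.2} - \frac{3.2}{0.4}\right)^2 = 0.39 \]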

PPSWOR sampling (PPS at each step):

$\pi_1 = 0.4413$, $\pi_2 = 0.7159$, $\pi_{12} = 0.2336$

The Horvitz-Thompson estimate of $Y$ is

\[ \hat{Y}_{HT} = \frac{y_1}{\pi_1} + \frac{y_2}{\pi_2} = \frac{1.2}{0.4413} + \frac{3.2}{0.7159} = 7.19 \]

\[ \text{Est. } V(\hat{Y}_{HT}) = \frac{\pi_1\pi_2 - \pi_{12}}{\pi_{12}}\left(\frac{y_1}{\pi_1} - \frac{y_2}{\pi_2}\right)^2 = \frac{0.3159 - 0.2336}{0.2336}\,[2.7192 - 4.4699]^2 = 1.0798 \]


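Hartley-Rao PPS systematic sampling:

\[ \hat{Y}_{HR} = \tfrac{1}{2}\left[y_1/p_1 + y_2/p_2\right] = 7 \]

\[ \text{Est. } V(\hat{Y}_{HR}) = \frac{1}{2^2(2-1)}\left[1 - 2(p_1 + p_2) + 2\sum_{i=1}^{N} p_i^2\right]\left(\frac{y_1}{p_1} - \frac{y_2}{p_2}\right)^2 = \tfrac{1}{4}\left[1 - 2(0.2 + 0.4) + 2(0.3)\right]\left(\frac{1.2}{0.2} - \frac{3.2}{0.4}\right)^2 = 0.4 \]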


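Rao-Hartley-Cochran method:

Group I                      Group II
Unit   Prob.   $Y_i$         Unit   Prob.   $Y_i$
 1     0.1     0.5            3     0.3     2.1
 2     0.2     1.2            4     0.4     3.2

$P'_1 = 0.3$, $P'_2 = 0.7$, $y_1 = 1.2$, $y_2 = 3.2$, $p_1 = 0.2$, $p_2 = 0.4$, $N = 4$, $n = 2$

\[ \hat{Y}_{RHC} = \frac{y_1}{p_1/P'_1} + \frac{y_2}{p_2/P'_2} = \frac{1.2}{0.2/0.3} + \frac{3.2}{0.4/0.7} = 1.8 + 5.6 = 7.4 \]

\[ \text{Est. } V(\hat{Y}_{RHC}) = \frac{N(N-n)}{n-1}\left[P'_1\left(\frac{y_1}{Np_1} - \frac{\hat{Y}_{RHC}}{N}\right)^2 + P'_2\left(\frac{y_2}{Np_2} - \frac{\hat{Y}_{RHC}}{N}\right)^2\right] \]
\[ = (4)(2)\left[0.3\left(\frac{1.2}{4 \times 0.2} - 1.85\right)^2 + 0.7\left(\frac{3.2}{4 \times 0.4} - 1.85\right)^2\right] = 0.42 \]

Midzuno-Sen scheme of sampling:

$N = 4$, $n = 2$, $y_1 = 1.2$, $y_2 = 3.2$, $p_1 = 0.2$ and $p_2 = 0.4$

\[ \pi_1 = \frac{N-n}{N-1}\,p_1 + \frac{n-1}{N-1} = \frac{4-2}{4-1}\,p_1 + \frac{2-1}{4-1} = 0.4667 \]

\[ \pi_2 = \frac{N-n}{N-1}\,p_2 + \frac{n-1}{N-1} = \frac{4-2}{4-1}\,p_2 + \frac{2-1}{4-1} = 0.6000 \]

\[ \pi_{12} = \frac{n-1}{N-1}\left[\frac{N-n}{N-2}(p_1 + p_2) + \frac{n-2}{N-2}\right] = \frac{2-1}{4-1}\left[\frac{4-2}{4-2}(0.2 + 0.4) + \frac{2-2}{4-2}\right] = 0.2 \]

\[ \hat{Y}_{MS} = (y_1/\pi_1) + (y_2/\pi_2) = 7.9045 \]

\[ \text{Est. } V(\hat{Y}_{MS}) = \frac{\pi_1\pi_2 - \pi_{12}}{\pi_{12}}\left(\frac{y_1}{\pi_1} - \frac{y_2}{\pi_2}\right)^2 = 3.0517 \]
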

Estimates of relative efficiency

Sl. No.   Estimate                               Variance estimate   Relative efficiency (%)
  1.      PPSWR                                  1                   100
  2.      Des Raj’s WOR scheme (ordered)         0.64                156
  3.      Murthy's WOR scheme (unordered)        0.39                256
  4.      PPSWOR (PPS at each step)              1.08                 92
  5.      Hartley-Rao PPS systematic sampling    0.40                250
  6.      Rao-Hartley-Cochran method             0.42                238
  7.      Midzuno-Sen scheme                     3.05                 32
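The estimates and variance estimates of Example 4.4 can be reproduced with a few lines of Python. This is a sketch under stated assumptions: it is written for a sample of exactly two units, the function name and argument names are hypothetical, and the RHC variance coefficient $(N-n)/\{N(n-1)\}$ used here holds for random groups of equal size. PPSWOR (PPS at each step) is omitted because it needs the enumerated inclusion probabilities.

```python
def example_4_4_estimates(y1, y2, p1, p2, p_all, group_totals, N):
    """Return {scheme: (estimate of Y, variance estimate)} for a sample of two units."""
    n = 2
    r1, r2 = y1 / p1, y2 / p2
    d2 = (r1 - r2) ** 2
    out = {}

    hh = (r1 + r2) / 2  # Hansen-Hurwitz estimate under PPSWR
    out["PPSWR"] = (hh, ((r1 - hh) ** 2 + (r2 - hh) ** 2) / (n * (n - 1)))

    # Des Raj (ordered): unit with p1 is the first draw
    out["Des Raj"] = ((r1 * (1 + p1) + r2 * (1 - p1)) / 2,
                      (1 - p1) ** 2 * d2 / 4)

    den = 2 - p1 - p2
    out["Murthy"] = ((r1 * (1 - p2) + r2 * (1 - p1)) / den,
                     (1 - p1) * (1 - p2) * (1 - p1 - p2) * d2 / den ** 2)

    sum_p2 = sum(p * p for p in p_all)
    out["Hartley-Rao"] = (hh, (1 - n * (p1 + p2) + n * sum_p2) * d2 / (n ** 2 * (n - 1)))

    g1, g2 = group_totals  # RHC: sums of p_i within the two random groups
    rhc = r1 * g1 + r2 * g2
    out["RHC"] = (rhc, (N - n) / (N * (n - 1))
                  * (g1 * (r1 - rhc) ** 2 + g2 * (r2 - rhc) ** 2))

    # Midzuno-Sen inclusion probabilities
    pi1 = (N - n) / (N - 1) * p1 + (n - 1) / (N - 1)
    pi2 = (N - n) / (N - 1) * p2 + (n - 1) / (N - 1)
    pi12 = (n - 1) / (N - 1) * ((N - n) / (N - 2) * (p1 + p2) + (n - 2) / (N - 2))
    ms = y1 / pi1 + y2 / pi2
    out["Midzuno-Sen"] = (ms, (pi1 * pi2 - pi12) / pi12 * (y1 / pi1 - y2 / pi2) ** 2)
    return out


results = example_4_4_estimates(1.2, 3.2, 0.2, 0.4,
                                [0.1, 0.2, 0.3, 0.4], (0.3, 0.7), N=4)
for scheme, (est, var) in results.items():
    print(f"{scheme:12s} estimate={est:.2f}  variance estimate={var:.2f}")
```

The relative efficiency of each scheme with respect to PPSWR is then 100 times the PPSWR variance estimate divided by the scheme's variance estimate.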


EXERCISES 

4.1 Calculate inclusion probability matrix for the following sampling designs 

a) $U = \{1, 2, 3\}$; $d = (S_d, P_d)$

$S_d = \{s_1 = (1, 3),\ s_2 = (1, 2),\ s_3 = (3)\}$

P(s\)= \,P(S 2 ) = S 's,p(si)= l/ 8 

b) $U = \{1, 2, 3, \ldots, N\}$; $d$: $P(i) = p_i$, $1 \le i \le N$,

$P(1, 2, \ldots, N) = 1 - (p_1 + \ldots + p_N)$, $0 < p_i < 1/N$.

4.2 Determine first order and second order inclusion probabilities in simple 
random sampling with replacement. 

4.3 Compare the efficiency of PPSWOR (PPS at each step) and the
Midzuno scheme of sampling to estimate the total Y of the following
artificial population (Jessen, 1978) with sample size 2.

Units           1    2    3    4
$Y_i$           3    4    7   10
$X_i$ (size)    1    2    3    4

Also, compute the design effect and comment on how much efficiency is
gained or lost by using unequal probability sampling compared to equal
probability sampling.

4.4 Consider the sampling design d: p(1,2) = 0.4 and p(3,4,5) = 0.6. Can you
construct an estimate of the population total Y which has smaller variance
than the HTE at some point $Y_0 \in R^5$?

4.5 Construct a probability sampling design for N = 6 and fixed sample
size n = 3, which differs from simple random sampling and has $\pi_i = 1/2$ and
$\pi_{ij} = 1/5$, $1 \le i \ne j \le 6$.

4.6 Denote $I_i$ and $I_j$ as the sample membership functions for the ith and jth units of
the population respectively. Show that for SRSWOR $I_i$ and $I_j$ are
negatively correlated.

4.7 To select a sample under Poisson sampling (Hajek, 1964) perform a set of
N binomial trials to determine whether each unit of the population is to be
included in the sample s or not. If $p_i$ $(0 < p_i < 1)$ is the probability of
success in the ith trial,

\[ P(s) = \prod_{i \in s} p_i \prod_{i \in U - s} (1 - p_i). \]

Derive the variance of the Horvitz-Thompson estimator of the population total.

4.8 Consider a sampling scheme where the first unit is selected with pps,
the second unit with pps of the remaining units, and the remaining (n - 2)
units in the sample are selected with equal probabilities and without
replacement. Suggest an estimate of the population total and find its variance.
Show that under the proposed scheme the Yates-Grundy estimator of
variance is always positive.

4.9 Compare the relative efficiencies of the strategies due to
Horvitz-Thompson, Rao-Hartley-Cochran and Sen-Midzuno in estimating
the finite population total under the assumption of a super population
model

$Y_i = \beta X_i + e_i$, $i = 1, \ldots, N$

with $E(e_i \mid x_i) = 0$, $E(e_i^2 \mid x_i) = \sigma^2 x_i^g$

and $E(e_i e_j \mid x_i, x_j) = 0$ for $i \ne j$,

where $0 < g < 2$ and $\sigma^2$ is an unknown positive constant (Chaudhuri and
Arnab, 1979).

4.10 Let U = {1, 2, 3, 4, 5} and the sampling design d be

P({1,3,5}) = 2/10, P({1,3,4}) = 3/10,
P({2,3,4}) = 1/10, and P({1,2,5}) = 4/10.
Assuming the population values
$Y_1 = 5$, $Y_2 = 10$, $Y_3 = 2$, $Y_4 = Y_5 = 3$,

determine the Horvitz-Thompson and Yates-Grundy estimates of the
variance of the HTE of the population total for each sample and
comment.

4.11 Let d be a uniform sampling design on U = {1, 2, 3} with
$s_1 = (1)$, $s_2 = (2)$ and $s_3 = (2, 3)$. Are

i) the HTE of population total Y

ii) the estimator $t(s, y) = 3Y_1$ if $s = (1)$

$= 3Y_2$ if $s = (2)$

$= 3Y_3$ if $s = (2, 3)$

of population total Y, admissible in the class of all unbiased estimators of
Y?

4.12 Prove that both Est. $V_{HT}(\hat{Y}_{HT})$ and Est. $V_{YG}(\hat{Y}_{HT})$ assume non-negative
values if Est. $V_{HT}(\hat{Y}_{HT})$ = Est. $V_{YG}(\hat{Y}_{HT})$. Show, further, that for a fixed
sample size (n = 2) design this is possible if and only if the design is
SRSWOR (N, 2).

4.13 (a) Suppose out of three units in a PPSWR sample two are distinct and let
the sample values of the three units with corresponding probabilities be
$(y_1, p_1)$, $(y_1, p_1)$ and $(y_2, p_2)$. Compare the efficiency of the following
estimators of Y:

i) $t_1 = \frac{1}{3}\,[2y_1/p_1 + y_2/p_2]$,

ii) $t_2 = \frac{1}{3}\,[(y_1/p_1) + (y_2/p_2) + (y_1 + y_2)/(p_1 + p_2)]$,

iii) $t_3 = y_1/[1 - (1 - p_1)^3] + y_2/[1 - (1 - p_2)^3]$.

b) Show that the estimate of $V(\hat{Y}_{PPS})$ on the basis of a simple random
sample is given by

\[ v_{SRS}(\hat{Y}_{PPS}) = (N/n^2)\left[\sum y_i^2/p_i - N(n\bar{y}^2 - s^2)\right]. \]

4.14 Suppose a population of N individuals is sampled with unequal
probabilities and with replacement. The sampling continues until (r + 1)
distinct individuals have been selected, when (n + 1) observations $(n \ge r)$
will have been made. The last observation is ignored, leaving r distinct
values $y_i$, observed $n_i$ times with $\sum n_i = n$. Show that

is an unbiased estimate of the mean $\bar{Y}$ and that an unbiased estimator of its variance is
given by



1 


n{n~ 1)/=] 


- 0 ) : 


4.15 Let there be N units $(U_1, \ldots, U_N)$ in the population. The population is
randomly divided into $(n + k)$ groups of sizes $N_1, \ldots, N_{n+k}$ such that
$\sum_{i=1}^{n+k} N_i = N$. For the first group select $N_1$ units out of N units with
SRSWOR, for the second group $N_2$ out of $(N - N_1)$ units, and so on.
From these $(n + k)$ groups select n groups with SRSWOR. For selecting
the ultimate sample of n units we select one unit with probability
proportional to the original probabilities $p_t$, with $\sum p_t = 1$, from each of the n
groups independently. Show that

\[ t = \frac{n+k}{n}\sum_{i=1}^{n}\frac{y_i}{p_i}\,\pi_i, \]

where $\pi_i = \sum_{\text{group } i} p_t$, is an unbiased estimate of the population total Y.

Derive the variance of t.

4.16 In PPSWR sampling compare the efficiencies of the following estimators of
the population mean $\bar{Y}$ when the population size and sample size are N and n
respectively and $p_i$ is the probability of selecting the ith unit:

1 it M n 

n ,Ti N Pi 1=1 



4.17 Show that the variance of the Horvitz-Thompson estimator of a finite
population total based on Brewer’s πPSWOR procedure is smaller than
the variance of the Hansen-Hurwitz estimator based on a PPSWR sample
involving the same number of draws.

4.18 Compare the strategy consisting of the Midzuno sampling scheme modified
to a πPS design and the ratio estimator with the Hansen-Hurwitz strategy
(Mukhopadhyay, 1994).

4.19 Derive conditions under which the strategy consisting of the Horvitz-Thompson
estimator and a fixed size design with $\pi_i \propto p_i$ fares better than
the probability proportional to size with replacement design combined
with the usual Hansen-Hurwitz estimator (Gabler, 1984).

4.20 Investigate conditions under which Midzuno’s strategy proves superior to
the PPSWR strategy (Mukhopadhyay and Bhattacharya, 1991).

4.21 Show that the variance of the Horvitz-Thompson estimator may not generally
decrease monotonically with increasing sample size or average effective
sample size (Chaudhuri, 1978).

4.22 Consider a method of unequal probability sampling where a population is
subdivided at random into k sub-populations or groups, say, $G_i$ $(i = 1, \ldots, k)$,
with predetermined sizes $N_1, \ldots, N_k$. For $i = 1, \ldots, k$ draw a sample
$s_i$ of $n_i$ elements from $G_i$ according to Midzuno’s method with $Z_i$ as size
measure and $\sum n_i = n$. Let $Z(i)$ be the aggregated size of $G_i$. Show that

\[ \hat{Y}_c = \sum_{i=1}^{k} Z(i)\,\frac{y(s_i)}{Z(s_i)}, \]

where $y(s_i)$ and $Z(s_i)$ are the sample totals corresponding to the ith
sub-population, is an unbiased estimator of the population total Y. Derive the
variance of $\hat{Y}_c$ and suggest an unbiased estimator of the variance
(Chikkagoudar, 1967).

4.23 Let there be a finite population U of size N and let V be a pre-determined
subset of U. Let P be a probability vector $(p_1, \ldots, p_N)$. Consider
the following sampling procedure to select a sample of size n.

i) Select one unit from U according to vector P.

ii) If the selected unit belongs to V, then retain it and select (n - 1) units
from the remaining (N - 1) units by SRSWOR.

iii) If the selected unit does not belong to V, then discard it and select n
units from the remaining (N - 1) units by SRSWOR.

Show that

\[ \pi_i = \begin{cases} \dfrac{n - p}{N - 1} + \dfrac{N - n}{N - 1}\,p_i, & \text{if } U_i \in V, \\[2ex] \dfrac{n - p}{N - 1} - \dfrac{n}{N - 1}\,p_i, & \text{if } U_i \in V^c, \end{cases} \]

where $V^c$ is the complement of V and $p = \sum_{i \in V} p_i$.


71 a = 


n-1 


N ~ 2 ’ (W-l)(Af-2) 


4.24 Let $Y_i$ $(i = 1, \ldots, N)$ be the value of the ith unit of the population having
inclusion probability $\pi_i$. For a fixed sample of size n,
i) Show that an estimator of N is given by


n 
Derive the variance of $\hat{N}$.

ii) Compare the following estimators of the population mean $\bar{Y}$:


a)', = 1 H 

N£In,- 


( 


y,- 



n y 2 (1 - 71 • ) n 71 /; — 7C / 71 ; 

4.25 Comment on / = X Z —- + X X —- y j , as an esti- 

7*7 1) 7T 7 71 ij 7*/ 7C y 7T y 7T yy 

/s. 

mator of HYht) 

4.26 Prove that the Yates-Grundy estimator for the variance of the HT 
estimator of the population total may be inadmissible (Biyani, 1980). 

4.27 Prove that no best estimators exist for the classes of estimators which are 
linear in the functions^,, y lh y, - y,. 

4.28 Suppose a sample is drawn with PPS($z_i$) and with replacement, and $Y_i$ is
correlated with $w_i$, with $\sum w_i = 1$, instead of with $z_i$.

Compare the efficiency of the following estimators of population total:




5. STRATIFIED SAMPLING 


5.1 Introduction 

While computing the variance of an estimator of population mean or total in 
simple random sampling we have seen that the variance expression is 
dependent on both the sample size and finite population variance. This suggests 
that to obtain a good estimator one should attempt either to increase the sample 
size or to devise certain method of sampling which would reduce heterogeneity 
or variability among the units of the population. One such method is the method 
of stratification which consists in classifying the population units into certain 
number of non-overlapping groups called strata and selecting samples 
independently from each group or stratum. Although the stratification is 
usually adopted to reduce within stratum variation it may also be adopted for 
the administrative convenience of survey operations and field work and if the 
sampling problems vary from group to group and also if specified precision 
differs from one subdivision of the population to another. For instance, for 
administrative reasons the whole geographical area under study may be divided 
into different non-overlapping and unambiguously demarcated sub-areas with 
a field office and necessary auxiliary staff to facilitate supervision and 
organization of field work which may not be so effectively carried out if a 
single administrative office is located in the entire region for the purpose. 
Further, the problems relating to response, measurement and available 
supplementary information may be dependent on the characteristics of the units 
in a sub-population and so vary from one sub-population to another. For
example, in a survey of industrial establishments, the problems of small
industrial establishments may be quite different from those of large industrial
establishments; in a survey of educational institutions, problems relating to
students in halls of residence may differ from those of students residing outside;
and in agricultural surveys, the problems relating to small farmers obviously differ
from those of farmers having large holdings. Under such circumstances it is
naturally justified to divide the whole population into sub-populations with 
independent choice of sampling design and estimator for each of them 
separately, if necessary. If the purpose of the survey demands specified 
precision for specified sub-populations (domains) one can naturally select 
independent samples from each sub-population (domain). For a skewed
population, stratification may enable the sampler to group the extreme items
together in a particular stratum, thereby reducing the variability
within the strata, with a consequent gain in efficiency of the combined estimate.
Another reason for stratification is that one may stratify the population with the
help of a suitably chosen size variable in order to recover the potential gain in
efficiency which may be achieved by πPS sampling, thereby avoiding the
heavy computations involved therein.

Stratification introduces considerable amount of flexibility in sampling 
procedures and methods of estimation in the sense that one is at liberty to apply 
different sampling procedure and method of estimation to different strata 
depending on the nature of supplementary information available. The sample 
size to be selected from each stratum may also differ as the cost of making an 
observation might differ from stratum to stratum. 

Stratification presents a better cross section of the population in the
sample, and by bringing about homogeneity within strata it is more likely that
stratified sampling provides a more precise estimate compared to unrestricted
random sampling.

For implementation of stratified sampling the sizes of the sub-populations 
or strata must be known in advance and further, unambiguous frame for each 
stratum must be either available or constructed for the purpose. 

The choice of the stratifying variable from among a host of stratifying 
variables and characteristics requires a judicious decision on the part of a 
survey sampler keeping in view the efficiency, convenience and other 
considerations discussed above. 

Although theoretically a different sampling design and estimator may be specified
for each stratum, it is often the practice to use uniform designs and estimators for
all strata.

5.2 Stratified Random Sampling 

Let U be a finite population having representation $U = (1, 2, \ldots, N)$. Partition U
into k sub-populations or strata $U_1, U_2, \ldots, U_k$. The number of elements or
units in the hth stratum is $N_h$ $(h = 1, \ldots, k)$. Obviously, $\sum_{h=1}^{k} N_h = N$. Select a simple
random sample $(s_h)$ of size $n_h$ from the hth stratum, $h = 1, \ldots, k$.

The total sample $s = s_1 \cup s_2 \cup \ldots \cup s_k$ has size $n_1 + n_2 + \ldots + n_k = n$, say.

Let y be the character under study and let $Y_{hj}$ be the value attached to the jth
unit of the hth stratum $(j = 1, 2, \ldots, N_h;\ h = 1, \ldots, k)$.

The population total Y can be decomposed into

\[ Y = Y_1 + Y_2 + \ldots + Y_k = N_1\bar{Y}_1 + N_2\bar{Y}_2 + \ldots + N_k\bar{Y}_k = N\sum W_h\bar{Y}_h, \]

where $W_h = N_h/N$ and $\bar{Y}_h$ is the hth sub-population or stratum mean. Furthermore,
the population mean $\bar{Y}$ may be written as $\bar{Y} = \sum W_h\bar{Y}_h$.

5.2.1 Estimator of Population Mean/Total and Its Variance 

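For simple random sampling design in each stratum the estimate of population
mean $\bar{Y}$ is given by

\[ \bar{y}_{st} = \sum_{h=1}^{k} W_h\bar{y}_h, \]

where

\[ \bar{y}_h = \frac{1}{n_h}\sum_{j=1}^{n_h} y_{hj}, \]

and the estimate of population total Y is $\hat{Y} = N\bar{y}_{st}$. Now,

\[ E(\bar{y}_{st}) = E\left[\sum W_h\bar{y}_h\right] = \sum W_h E(\bar{y}_h) = \sum W_h\bar{Y}_h = \bar{Y}, \]

since in the hth stratum $\bar{y}_h$ is an unbiased estimate of the sub-population or stratum
mean $\bar{Y}_h$ $(h = 1, \ldots, k)$. This shows that $\bar{y}_{st}$ is an unbiased estimate of $\bar{Y}$.
Further,

\[ E(\hat{Y}) = E(N\bar{y}_{st}) = NE(\bar{y}_{st}) = N\bar{Y} = Y. \]

Now $V(\bar{y}_{st}) = V\left[\sum W_h\bar{y}_h\right] = \sum W_h^2 V(\bar{y}_h)$, as the covariance terms vanish
because the stratum means are independent. Thus,

\[ V(\bar{y}_{st}) = \sum W_h^2\left(\frac{1}{n_h} - \frac{1}{N_h}\right)S_h^2 \quad \text{(for SRSWOR in each stratum)} \]
\[ = \sum W_h^2(1 - f_h)S_h^2/n_h, \]

where $f_h = n_h/N_h$, called the sampling fraction in the hth stratum, and

\[ S_h^2 = \frac{1}{N_h - 1}\sum_{j=1}^{N_h}(Y_{hj} - \bar{Y}_h)^2. \]

The variance of the estimate of population total Y is derived as $N^2V(\bar{y}_{st})$.

The standard error of $\bar{y}_{st}$ is computed as S.E. $(\bar{y}_{st}) = \sqrt{V(\bar{y}_{st})}$ and that of $\hat{Y}$ as
$N\sqrt{V(\bar{y}_{st})}$.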

5.2.2 Estimator of Variance 

Invoking arguments as in the case of unrestricted random sampling, it may be seen
that

\[ s_h^2 = \frac{1}{n_h - 1}\sum_{j=1}^{n_h}(y_{hj} - \bar{y}_h)^2 \]

is an unbiased estimate of $S_h^2$, and hence for SRSWOR in each stratum an unbiased estimate of $V(\bar{y}_{st})$ is

\[ \text{Est. } V(\bar{y}_{st}) = \sum W_h^2(1 - f_h)s_h^2/n_h. \]

An estimate of the standard error of the estimate of population mean is Est. S.E. $(\bar{y}_{st})
= \sqrt{\text{Est. } V(\bar{y}_{st})}$, and an estimate of the standard error of the estimate of
population total is Est. S.E. $(\hat{Y}) = N\sqrt{\text{Est. } V(\bar{y}_{st})}$.
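The estimator of the mean and its variance estimate translate directly into code. The sketch below assumes SRSWOR within every stratum and at least two sampled units per stratum; the function name and the small data set are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def stratified_estimate(samples, N_h):
    """Return (estimate of the population mean, estimated variance of that estimate).

    samples: one list of sampled y-values per stratum (at least 2 values each)
    N_h:     the known stratum sizes, in the same order
    """
    N = sum(N_h)
    est, var = 0.0, 0.0
    for y_h, Nh in zip(samples, N_h):
        nh = len(y_h)
        Wh = Nh / N                      # stratum weight W_h
        fh = nh / Nh                     # sampling fraction f_h
        est += Wh * mean(y_h)
        var += Wh ** 2 * (1 - fh) * variance(y_h) / nh   # variance() uses divisor n_h - 1
    return est, var


# Hypothetical data: two strata of sizes 30 and 20
ybar_st, v_st = stratified_estimate([[2, 4, 6], [10, 14]], [30, 20])
N = 50
print(ybar_st, v_st)                 # mean estimate and its estimated variance
print(N * ybar_st, N * sqrt(v_st))   # total estimate and its estimated standard error
```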


5.2.3 Allocation of Sample to Strata 

Consider a population where strata sizes N\, N 2 ,..., Nk are known and relevant 
frames are available. Now the question arises as to what should be the size of 
the sample to be selected from each stratum to estimate the population mean? 
The same allocation stands if we need to estimate the population total. 

Optimum Allocations 


Case 1: Total sample size n is fixed: As it is natural to expect that the sample
sizes $n_1, n_2, \ldots, n_k$ should be chosen such that the variance of the proposed
estimate is made minimum, we minimize $V(\bar{y}_{st})$ subject to the restriction $\sum_{h=1}^{k} n_h = n$.

To determine the optimum value of $n_h$ we use the Lagrange multiplier $\lambda$ and
minimize

\[ \Psi = V(\bar{y}_{st}) + \lambda\left(\sum_{h=1}^{k} n_h - n\right). \]

Differentiating $\Psi$ with respect to $n_h$, we have

\[ -W_h^2S_h^2/n_h^2 + \lambda = 0, \quad (h = 1, \ldots, k) \]

or, $n_h = W_hS_h/\sqrt{\lambda}$.

Summing over all the strata, $n = \sum W_hS_h/\sqrt{\lambda}$.

Thus, we have $n_h = n\,W_hS_h/\sum W_hS_h$.

Such allocation is due to Neyman (1934) and hence is called Neyman
allocation.
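A minimal Python sketch of Neyman allocation follows. The stratum sizes and standard deviations are hypothetical, and the function returns fractional sample sizes that would be rounded in practice.

```python
def neyman_allocation(n, N_h, S_h):
    """Allocate n in proportion to N_h * S_h (equivalently W_h * S_h)."""
    weights = [Nh * Sh for Nh, Sh in zip(N_h, S_h)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata: sizes 500, 300, 200 with standard deviations 2, 4, 10
n_h = neyman_allocation(100, [500, 300, 200], [2, 4, 10])
print([round(v, 2) for v in n_h])
```

In this illustration the smallest stratum receives the largest share of the sample because its standard deviation is much larger than that of the others.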

Case 2: Cost is fixed: Assume the cost function

\[ C = C_0 + \sum_{h=1}^{k} c_hn_h, \]

where $C_0$ stands for the overhead cost in a survey and $c_h$ is the cost of observing a
unit from the hth stratum.

Choosing $n_h$ to minimize $V(\bar{y}_{st}) = V - \sum W_h^2S_h^2/N_h$,
where $V = \sum W_h^2S_h^2/n_h$, is the same as minimizing the product

\[ V(C - C_0) = \left(\sum W_h^2S_h^2/n_h\right)\left(\sum c_hn_h\right). \]

Using the Cauchy-Schwarz inequality $\left(\sum a_h^2\right)\left(\sum b_h^2\right) \ge \left(\sum a_hb_h\right)^2$, where equality
occurs when $a_h/b_h$ = constant, we have

\[ V(C - C_0) = \sum\left(\frac{W_hS_h}{\sqrt{n_h}}\right)^2 \sum\left(\sqrt{c_hn_h}\right)^2 \ge \left(\sum W_hS_h\sqrt{c_h}\right)^2. \]

Thus, for no choice of $n_h$ can $V(C - C_0)$ be smaller than $\left(\sum W_hS_h\sqrt{c_h}\right)^2$. The
minimum value occurs when $\sqrt{c_hn_h}\,/\,(W_hS_h/\sqrt{n_h})$ = constant, that is,
$n_h/(W_hS_h/\sqrt{c_h})$ = constant. If $n = \sum n_h$,

\[ \frac{n_h}{n} = \frac{W_hS_h/\sqrt{c_h}}{\sum W_hS_h/\sqrt{c_h}} = \frac{N_hS_h/\sqrt{c_h}}{\sum N_hS_h/\sqrt{c_h}}. \]

Thus, the optimum value of $n_h$ indicates that we take a larger sample from a
stratum if (a) the stratum is cheaper to sample, (b) the stratum is more
variable internally, (c) the stratum is larger.

To get the value of n, substitute the optimum $n_h$ in the cost function and
solve for n. Thus,

\[ n = \frac{(C - C_0)\sum N_hS_h/\sqrt{c_h}}{\sum N_hS_h\sqrt{c_h}}. \]

Case 3: Variance is fixed: To minimize C over $n_h$ for fixed $V(\bar{y}_{st}) = V - \sum W_h^2S_h^2/N_h$,
the product $V(C - C_0)$, as stated earlier, is minimised, leading to the same ratio

\[ n_h/n = (W_hS_h/\sqrt{c_h})\,/\,\sum (W_hS_h/\sqrt{c_h}). \]

The value of n is obtained by substituting the optimum $n_h$ in the formula for $V(\bar{y}_{st})$
and, on simplification,

\[ n = \frac{\left(\sum W_hS_h\sqrt{c_h}\right)\left(\sum W_hS_h/\sqrt{c_h}\right)}{V(\bar{y}_{st}) + (1/N)\sum W_hS_h^2}. \]

Note: If $c_h = c$, that is, cost per unit is the same in all strata, the cost is $C = C_0 + cn$
and optimum allocation for fixed cost reduces to optimum allocation for fixed sample size.
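The fixed-cost case can be sketched in Python as follows. The budget, unit costs and stratum parameters are hypothetical, and the resulting sample sizes are fractional; the sketch only illustrates that the allocation exhausts the variable budget $C - C_0$ and is proportional to $N_hS_h/\sqrt{c_h}$.

```python
from math import sqrt


def cost_optimal_allocation(C, C0, N_h, S_h, c_h):
    """Optimum allocation for fixed total cost C = C0 + sum(c_h * n_h)."""
    a = [Nh * Sh / sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h)]
    b = [Nh * Sh * sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h)]
    n = (C - C0) * sum(a) / sum(b)          # total sample size affordable
    n_h = [n * ai / sum(a) for ai in a]     # split proportional to N_h S_h / sqrt(c_h)
    return n, n_h


# Hypothetical survey: budget 1000, overhead 100, unit costs 1, 4 and 9
costs = [1, 4, 9]
n, n_h = cost_optimal_allocation(1000, 100, [500, 300, 200], [2, 4, 10], costs)
spent = sum(c * m for c, m in zip(costs, n_h))
print(round(n, 2), [round(v, 2) for v in n_h], round(spent, 2))
```

With equal unit costs the same function returns an allocation proportional to $N_hS_h$, which is the Neyman allocation.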


Non-optimum Proportional Allocations

Case 1: Allocation proportional to $W_h$: This proportional allocation is defined
as $n_h = nW_h$, an allocation originally proposed by Bowley (1926). This is
optimal when, in Neyman allocation, the strata standard deviations are all equal.

Case 2: Allocation proportional to $W_h\bar{Y}_h$: If the variation within strata does not
vary considerably, it would be preferable to allocate the sample size in
proportion to the stratum total $Y_h$. That is, $n_h = n(W_h\bar{Y}_h/\bar{Y})$.

As the $\bar{Y}_h$ are not known, the allocation, in practice, has to be made in
proportion to the strata totals of a suitably chosen supplementary variable x.
Thus, the allocation reduces to $n_h = n(W_h\bar{X}_h/\bar{X})$.

Case 3: Allocation proportional to $W_hR_h$: In case of populations usually under
consideration, it is often noticed that $R_h$, the range of the variable in the hth stratum,
provides an approximation to the stratum standard deviation. Hence, in some
situations where exact or approximate knowledge of strata ranges is available,
it may be useful to allocate the sample size proportional to $W_hR_h$. That is, $n_h =
n(W_hR_h/\sum W_hR_h)$.

Case 4: x-optimal allocation: Suppose x is an auxiliary variable highly
correlated with the study variable y. If $S_{xh}$, the stratum standard deviation for x
in the hth stratum, is known for $h = 1, 2, \ldots, k$, then the x-optimal allocation is given
by

\[ n_h = n\,W_hS_{xh}/\sum W_hS_{xh}. \]

The justification for such an allocation arises when x is highly correlated with y
and the coefficient of variation of x is the same in all strata. This makes the
allocation not very far from optimal.

5.2.4 Stratified Random Sampling vs Unrestricted Simple
Random Sampling - Comparison of Variances


The stratified random sampling is likely to produce more precise estimates
compared to simple random sampling if strata are properly constructed. If the $n_h$ are
far from optimum, stratified sampling may have larger variance compared
to unrestricted random sampling.

Consider Neyman optimum allocation where $n_h = n\,W_hS_h/\sum W_hS_h$. Then

\[ V_{opt}(\bar{y}_{st}) = V_{opt} = \frac{1}{n}\left(\sum W_hS_h\right)^2 - \frac{1}{N}\sum W_hS_h^2. \]

Under proportional allocation with $n_h = nW_h$,

\[ V_{prop}(\bar{y}_{st}) = V_{prop} = \sum W_hS_h^2/n - \sum W_hS_h^2/N. \]

The variance of the simple expansion estimator under simple random sampling is
$V_{ran}(\bar{y}) = V_{ran} = (1/n - 1/N)S^2$, where

\[ S^2 = \frac{1}{N - 1}\sum_{h=1}^{k}\sum_{j=1}^{N_h}(Y_{hj} - \bar{Y})^2. \]

Now,

\[ V_{ran} - V_{prop} = \frac{N}{N - 1}\left(\frac{1}{n} - \frac{1}{N}\right)\left[\sum W_h(\bar{Y}_h - \bar{Y})^2 - \frac{1}{N}\sum (1 - W_h)S_h^2\right]. \]

The above expression indicates that there is a likelihood of stratified sampling
with proportional allocation being slightly less precise than simple random
sampling if the $\bar{Y}_h$ are all equal. However, in most situations, at least for large N,
$\sum W_h(\bar{Y}_h - \bar{Y})^2$ is likely to be considerably more than $(1/N)\sum (1 - W_h)S_h^2$ and
consequently $V_{ran}$ becomes greater than $V_{prop}$. As the differences among
the strata means constitute the major reason for variability of y, the proportional
allocation may result in smaller variance for the estimate. Thus for large N,

\[ V_{prop} \le V_{ran}. \]

Further,

\[ V_{prop} - V_{opt} = \frac{1}{n}\sum W_h(S_h - \bar{S})^2, \]

where $\bar{S} = \sum W_hS_h$, is always non-negative.

Thus, $V_{opt} \le V_{prop}$, which is otherwise obvious because $V_{opt}$ is based on
optimal allocation. Thus, we have for large N, $V_{opt} \le V_{prop} \le V_{ran}$.
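The ordering of the three variances can be illustrated numerically. The sketch below uses a small hypothetical population with two strata whose values are fully known, and ignores the fact that the implied stratum sample sizes need not be integers.

```python
from statistics import variance


def compare_variances(strata, n):
    """Return (V_ran, V_prop, V_opt) for the mean, from complete stratum listings."""
    N_h = [len(s) for s in strata]
    N = sum(N_h)
    W = [Nh / N for Nh in N_h]
    S2 = [variance(s) for s in strata]          # S_h^2 with divisor N_h - 1
    S = [v ** 0.5 for v in S2]
    everyone = [v for s in strata for v in s]

    v_ran = (1 / n - 1 / N) * variance(everyone)
    within = sum(w * s2 for w, s2 in zip(W, S2))
    v_prop = (1 / n - 1 / N) * within
    v_opt = sum(w * s for w, s in zip(W, S)) ** 2 / n - within / N
    return v_ran, v_prop, v_opt


# Hypothetical population: stratum means differ widely, as do the spreads
v_ran, v_prop, v_opt = compare_variances([[1, 2, 3, 4], [10, 12, 14, 16, 18, 20]], n=5)
print(round(v_ran, 4), round(v_prop, 4), round(v_opt, 4))
```

Here the gain from proportional allocation comes almost entirely from the difference between the stratum means, and the further gain from Neyman allocation comes from the unequal stratum standard deviations.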

5.2.5 Effects of Deviations from the Optimum Allocation 


Suppose it is desired to use the optimum sample size $n_{h(opt)}$ in stratum h. Then,
$n_{h(opt)} = n\,W_hS_h/\sum W_hS_h$. The minimum or optimum variance is

\[ V_{opt}(\bar{y}_{st}) = \frac{1}{n}\left(\sum W_hS_h\right)^2 - \frac{1}{N}\sum W_hS_h^2. \]

Since the computation of optimum sample sizes requires knowledge of the stratum
standard deviations $S_h$, which are not known, we can only use an approximation to the
optimum allocation. Assuming $n_h'$ to be the near-optimum allocation actually used,
we have

\[ V(\bar{y}_{st}) = \sum (W_h^2S_h^2/n_h') - (1/N)\sum W_hS_h^2. \]

Thus, the increase in variance due to imperfect allocation is

\[ V(\bar{y}_{st}) - V_{opt}(\bar{y}_{st}) = \sum (W_h^2S_h^2/n_h') - (1/n)\left(\sum W_hS_h\right)^2 \]
\[ = \sum \left[(n_{h(opt)}/n)\sum W_hS_h\right]^2(1/n_h') - (1/n)\left(\sum W_hS_h\right)^2 \]
\[ = \left[\left(\sum W_hS_h\right)^2/n^2\right]\left[\sum (n_h' - n_{h(opt)})^2/n_h'\right]. \]

Now, neglecting the finite population correction factor, the proportional increase in
variance due to deviation from optimum allocation is

\[ \frac{V(\bar{y}_{st}) - V_{opt}(\bar{y}_{st})}{V_{opt}(\bar{y}_{st})} = \frac{1}{n}\sum \frac{(n_h' - n_{h(opt)})^2}{n_h'}. \]
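The proportional increase in variance is simple to evaluate for any trial allocation. The following Python sketch assumes both allocations add up to the same total n and that the finite population correction is negligible; the numbers are hypothetical.

```python
def proportional_increase(n_used, n_opt):
    """(1/n) * sum over strata of (n_h' - n_h(opt))^2 / n_h', ignoring the fpc."""
    n = sum(n_used)
    return sum((used - opt) ** 2 / used for used, opt in zip(n_used, n_opt)) / n


# Hypothetical case: optimum is (20, 30, 50) but (30, 30, 40) was used
increase = proportional_increase([30, 30, 40], [20, 30, 50])
print(round(100 * increase, 2), "% increase in variance")
```

Moderate departures from the optimum therefore cost only a few percent in variance, which is why approximate values of the stratum standard deviations are usually adequate.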
5.2.6 Estimation of Gain in Efficiency Due to Stratification
from a Given Stratified Sample

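The estimated variance from a stratified sample is

\[ \text{Est. } V(\bar{y}_{st}) = \sum (W_h^2s_h^2/n_h) - \sum (W_h^2s_h^2/N_h). \]

To find the estimate of variance that would have been obtained from a simple
random sample, write

\[ V_{ran}(\bar{y}) = [(N - n)/Nn]\,S^2 = \frac{N - n}{n(N - 1)}\left[\frac{1}{N}\sum_{h}\sum_{j} Y_{hj}^2 - \bar{Y}^2\right]. \]

Therefore,

\[ \text{Est. } V_{ran}(\bar{y}) = \frac{N - n}{n(N - 1)}\left[\text{Est. }\frac{1}{N}\sum_{h}\sum_{j} Y_{hj}^2 - \text{Est. }\bar{Y}^2\right] \]
\[ = \frac{N - n}{n(N - 1)}\left[\frac{1}{N}\sum_{h=1}^{k}\frac{N_h}{n_h}\sum_{j=1}^{n_h} y_{hj}^2 - \bar{y}_{st}^2 + \text{Est. } V(\bar{y}_{st})\right]. \]

Thus, the estimate of percentage gain is

\[ \frac{\text{Est. } V_{ran}(\bar{y}) - \text{Est. } V(\bar{y}_{st})}{\text{Est. } V(\bar{y}_{st})} \times 100\%. \]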


5.2.7 Post Stratification 

Sometimes a separate frame for each stratum is not available in
advance, and as such the stratum to which a unit belongs is not known until the
data have been collected. The stratum sizes $N_1, N_2, \ldots, N_k$ are assumed to be
known from some reliable sources. Under such circumstances one may select a
sample of size n and then classify the units according to the strata in question. Suppose
in the sample $m_h$ units fall into the hth stratum. Now, with knowledge of $N_h$ $(h = 1, \ldots, k)$
one may build up a weighted estimate as

\[ \bar{y}_{st(post)} = \sum_{h=1}^{k} W_h\bar{y}_{m_h}. \]

It may be pointed out here that $\bar{y}_{st(post)}$ is based on the assumption that all $m_h$ are
non-zero; in case some $m_h$ is zero, two or more strata are to be combined before
forming the estimate. The sampling variance of $\bar{y}_{st(post)}$ for fixed $m_h$ is

\[ V(\bar{y}_{st(post)}) = \sum (W_h^2S_h^2/m_h) - (1/N)\sum W_hS_h^2. \]

As $m_h$ is a random variable, the correct sampling variance is obtained by taking
the expected value of $V(\bar{y}_{st(post)})$ over variations in $m_h$. Using a Taylor series
approximation for $1/m_h$ and then taking expectations,

\[ E\left(\frac{1}{m_h}\right) \approx \frac{1}{E(m_h)} + \frac{V(m_h)}{[E(m_h)]^3} = \frac{1}{nW_h} + \frac{1 - W_h}{n^2W_h^2}\left(\frac{N - n}{N - 1}\right). \]

Hence, approximately,

\[ E[V(\bar{y}_{st(post)})] = \sum W_h^2S_h^2\left[E(1/m_h) - 1/N_h\right] = \frac{N - n}{Nn}\sum W_hS_h^2 + \frac{N - n}{n^2(N - 1)}\sum (1 - W_h)S_h^2. \]

The first term turns out to be $V(\bar{y}_{st})$ under proportional allocation and the second
term is the increase in variance due to post-stratification. Thus, for large sample
size, post-stratification is almost as precise as stratified sampling with
proportional allocation.
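The post-stratified estimator and its large-sample variance can be sketched as follows. The code assumes known stratum weights $W_h$ and, for the variance, known or separately estimated $S_h^2$; the labels, data and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def post_stratified_mean(values, labels, W):
    """Weighted mean: sum over strata of W[h] * (mean of sample units in stratum h)."""
    groups = defaultdict(list)
    for v, h in zip(values, labels):
        groups[h].append(v)
    if any(h not in groups for h in W):
        raise ValueError("empty post-stratum: combine strata before estimating")
    return sum(W[h] * mean(groups[h]) for h in W)


def post_stratified_variance(W, S2, n, N):
    """Approximate variance: proportional-allocation term plus the post-stratification increase."""
    first = (N - n) / (N * n) * sum(W[h] * S2[h] for h in W)
    second = (N - n) / (n ** 2 * (N - 1)) * sum((1 - W[h]) * S2[h] for h in W)
    return first + second


# Hypothetical simple random sample of 5 units classified after selection
W = {"a": 0.6, "b": 0.4}
est = post_stratified_mean([2, 4, 6, 10, 14], ["a", "a", "a", "b", "b"], W)
var = post_stratified_variance(W, {"a": 4.0, "b": 8.0}, n=5, N=50)
print(round(est, 3), round(var, 4))
```

The second term is of order $1/n^2$, so its share of the total variance shrinks as the sample size grows.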

Agrawal and Panda (1993) have proposed an alternative estimator for
post-stratification of the form

\[ \bar{y}_{st(post)} = \sum W_{h\alpha}\,\bar{y}_{m_h}, \quad \text{where } W_{h\alpha} = \alpha(m_h/n) + (1 - \alpha)(N_h/N), \]

$\alpha$ being a suitably chosen weighting factor.

5.2.8 Collapsed Strata 


For a highly heterogeneous population and with a large number of criteria for
stratification it may be desirable to stratify to the extent of selecting one unit per
stratum. This procedure has the drawback that one cannot estimate the variance
for each stratum and hence $V(\bar{y}_{st})$. In such a situation the usual technique
adopted is the method of collapsed strata. In this method the strata are grouped
into pairs. Thus, if the observations belonging to a typical pair are $(y_{j1}, y_{j2})$, $j = 1, 2,
\ldots, k/2$, an estimate of $V(\bar{y}_{st})$ is given by

\[ v = \frac{1}{N^2}\sum_{j=1}^{k/2}(N_{j1}y_{j1} - N_{j2}y_{j2})^2, \]

$N_{j1}$ and $N_{j2}$ being the sizes of the paired strata. Taking expectations,

\[ E(v) = V(\bar{y}_{st}) + \frac{1}{N^2}\sum_{j=1}^{k/2}(Y_{j1} - Y_{j2})^2, \]

where $Y_{j1}$ and $Y_{j2}$ are the stratum totals of the jth pair.
The first term on the right is $V(\bar{y}_{st})$ with $n_h = 1$ and the second term is a
positive bias, which has a negligible effect if the grouping is done in such a way
that the stratum totals of the grouped strata vary little from each other.
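A minimal Python sketch of the collapsed-strata variance estimate for the mean, with one observation per stratum and strata grouped in pairs, is given below. The pairing, sizes and observations are hypothetical.

```python
def collapsed_strata_variance(pairs, N):
    """Collapsed-strata estimate of V(ybar_st) with one sampled unit per stratum.

    pairs: list of ((N_j1, y_j1), (N_j2, y_j2)), one entry per pair of strata
    N:     total population size
    """
    return sum((N1 * y1 - N2 * y2) ** 2 for (N1, y1), (N2, y2) in pairs) / N ** 2


# Hypothetical population of 40 units in four strata, grouped into two pairs
pairs = [((10, 5.0), (12, 4.0)), ((8, 7.0), (10, 6.0))]
v = collapsed_strata_variance(pairs, N=40)
print(v)
```

Each squared difference compares the two estimated stratum totals of a pair, so pairing strata with similar totals keeps the upward bias small.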

A more general estimator has been proposed by Hansen, Hurwitz and
Madow (1953), where the strata are grouped into g groups, each group
consisting of $k_h$ strata so that $\sum_{h=1}^{g} k_h = k$. Hartley, Rao and Kiefer (1969) have
proposed an alternative technique by using data on auxiliary variables
correlated with strata means in the population.

5.2.9 Allocation with More than One Characteristic 


In a large scale sample survey one is generally concerned with estimating more 
than one population characteristic. An optimal allocation for one characteristic 
may be different from an optimal allocation for other characteristics. Of course, 
when the variances of the characteristics are highly correlated optimal 
allocations may differ little among themselves. A very rough compromise 
allocation may be the average of all optimum allocations. Chatterjee (1967) has 
suggested an alternative compromise allocation in which one is to choose the/?/, 
that minimize the average of proportional increase in variance resulting from 
the deviations from the optimum allocation. That is, 

}h ' = 12 ^E n jMopt) jYa n )h(ppt) 

where n^opt) is the optimum sample size in stratum/? for theyth characteristic. 

Several other optimality criteria have been suggested by Dalenius (1957), 
Ghosh (1958) and Yates (1960). Kokan (1963b) discusses this problem and 
proposes a solution using non-linear programming technique. Kokan and Khan 
(1967) have given an analytical solution for the optimum allocation in 
multivariate surveys. Yates (1960) has suggested two useful methods which are 
discussed in the following. Suppose there are p characteristics. Form a linear 
function of the variances $V_j$ ($j = 1, 2, \ldots, p$) of the estimates of the population means of the $p$
characteristics based on stratified sampling as

$$L = a_1 V_1 + \ldots + a_p V_p.$$

This is to be minimized for given

$$C = C_0 + \sum_h C_h n_h.$$

The optimum $n_h$ varies as

$$n_h \propto N_h \sqrt{\sum_{j=1}^{p} a_j S_{jh}^2 \Big/ C_h},$$

where $S_{jh}^2$ is the variance of the $j$th characteristic in the $h$th stratum.

In the second approach, we specify the desired variance $V_{0j}$ for each
characteristic and minimize $C$ subject to $V_j \le V_{0j}$ ($j = 1, \ldots, p$) with the conditions
$0 \le n_h \le N_h$. This problem can be solved by non-linear programming techniques,
for which algorithms have been provided by Hartley and Hocking (1963),
Chatterjee (1966) and Kokan (1963b).
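The compromise allocations of this section may be compared numerically with a short sketch. The function names and argument layouts below are assumptions made for illustration; in particular, `yates_allocation` rescales the weights to a fixed total $n$ rather than to a fixed cost.

```python
import math


def average_allocation(opt_allocs):
    """Average of the optimum allocations; opt_allocs[j][h] is the optimum
    sample size in stratum h for characteristic j."""
    p, k = len(opt_allocs), len(opt_allocs[0])
    return [sum(opt_allocs[j][h] for j in range(p)) / p for h in range(k)]


def chatterjee_allocation(opt_allocs, n):
    """n_h proportional to sqrt(sum_j n_jh(opt)^2), scaled to total n."""
    k = len(opt_allocs[0])
    roots = [math.sqrt(sum(row[h] ** 2 for row in opt_allocs)) for h in range(k)]
    total = sum(roots)
    return [n * r / total for r in roots]


def yates_allocation(N_h, S2_jh, a, C_h, n):
    """n_h proportional to N_h * sqrt(sum_j a_j S_jh^2 / C_h), scaled to total n
    (scaling to n instead of to a cost budget is an assumption of this sketch)."""
    k = len(N_h)
    w = [N_h[h] * math.sqrt(sum(a[j] * S2_jh[j][h] for j in range(len(a))) / C_h[h])
         for h in range(k)]
    total = sum(w)
    return [n * x / total for x in w]


opt = [[10, 20, 30], [30, 20, 10]]   # optimum allocations for two characteristics
print(average_allocation(opt))
print(chatterjee_allocation(opt, 60))
```

With a single characteristic and equal unit costs, `yates_allocation` reduces to the Neyman allocation.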


5.2.10 Allocation Requiring 100% Sampling 


In certain situations the optimum $n_h$ may be found to be larger than $N_h$. This
happens when the overall sampling fraction is substantial and some strata are
much more variable than others. If in a particular stratum, say the first, $n_1 > N_1$ and there are
more than two strata, the revised allocation is

$$n'_1 = N_1, \quad n'_h = (n - N_1) \frac{W_h S_h}{\sum_{h=2}^{k} W_h S_h} \quad \text{for } h \ge 2.$$

If $n_1 > N_1$ and $n_2 > N_2$, the revised allocation is

$$n'_1 = N_1, \quad n'_2 = N_2, \quad n'_h = (n - N_1 - N_2) \frac{W_h S_h}{\sum_{h=3}^{k} W_h S_h} \quad \text{for } h \ge 3.$$

The usual formula for the variance under optimum allocation is no longer
appropriate here. The revised formula for the optimum variance may be found
to be

$$V_{opt}(\bar{y}_{st}) = \frac{\left(\sum' W_h S_h\right)^2}{n'} - \frac{\sum' W_h S_h^2}{N},$$

where $\sum'$ stands for summation over the strata for which $n_h < N_h$ and $n'$ is the total
sample size in these strata.
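The revision described above can be repeated until no stratum is over-allocated. The following is a minimal sketch of that idea, not a definitive routine: the function name is an assumption, all over-allocated strata are capped in the same pass, and the returned sizes are left unrounded.

```python
def neyman_with_full_enumeration(N_h, S_h, n):
    """Neyman allocation in which strata with n_h > N_h are sampled 100%
    and the remaining sample is reallocated among the other strata."""
    if n > sum(N_h):
        raise ValueError("sample size exceeds population size")
    alloc = [None] * len(N_h)
    free = set(range(len(N_h)))      # strata not yet fixed at n_h = N_h
    remaining = n
    while free:
        total = sum(N_h[h] * S_h[h] for h in free)
        trial = {h: remaining * N_h[h] * S_h[h] / total for h in free}
        over = [h for h in free if trial[h] > N_h[h]]
        if not over:
            for h in free:
                alloc[h] = trial[h]
            break
        for h in over:               # take these strata completely
            alloc[h] = N_h[h]
            remaining -= N_h[h]
            free.remove(h)
    return alloc


# The first stratum is small but highly variable, so it is taken completely.
print(neyman_with_full_enumeration([10, 100, 100], [50, 1, 1], 30))
```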

5.2.11 Construction of Strata 

It is known that for the stratified sampling to be efficient, the strata should be so 
constructed as to make within strata variation a minimum. One way of 
achieving this object is to have some prior information about the population so 
as to group them into more or less similar units and take the groups as strata. 
The situation becomes ideal if the frequency distribution of the characteristic 
under study is known. In the absence of such information one may approximate 
it from the past data. In case this is also not available, one may presumably 
construct strata with the help of the frequency distribution of some variable 
highly correlated with the main variable $y$ under study. If the number of strata is
fixed in advance, the equations for determining the best stratum boundaries 
under proportional and optimum (Neyman) allocations have been given by 
Dalenius (1957). 

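Let $y$ be the characteristic under study with frequency function $f(y)$, $a < y < b$,
where $a = y_0$ and $b = y_k$ are the smallest and largest values in the population.
The problem is to find intermediate boundary points $y_1, y_2, \ldots, y_{k-1}$ such that
$V_{N\text{-}opt}(\bar{y}_{st}) = (1/n)\left(\sum W_h S_h\right)^2 - (1/N)\sum W_h S_h^2$ is minimum. The $h$th stratum
has boundary points $y_{h-1}$ and $y_h$. If the f.p.c. is ignored, it is sufficient to
minimize $\sum W_h S_h$. It may be observed that minimization with respect to $y_h$ only
affects the terms $W_h S_h$ and $W_{h+1} S_{h+1}$, so that

$$\frac{\partial}{\partial y_h} \sum_h W_h S_h = \frac{\partial (W_h S_h)}{\partial y_h} + \frac{\partial (W_{h+1} S_{h+1})}{\partial y_h}.$$

Let $W_h = \int_{y_{h-1}}^{y_h} f(t)\,dt$ be the proportion of units in the $h$th stratum.
Then,

$$\frac{\partial W_h}{\partial y_h} = f(y_h).$$

Now,

$$S_h^2 = \frac{1}{W_h} \int_{y_{h-1}}^{y_h} t^2 f(t)\,dt - \left[\frac{1}{W_h} \int_{y_{h-1}}^{y_h} t f(t)\,dt\right]^2.$$

Therefore,

$$W_h S_h^2 = \int_{y_{h-1}}^{y_h} t^2 f(t)\,dt - \frac{1}{W_h}\left[\int_{y_{h-1}}^{y_h} t f(t)\,dt\right]^2.$$

Differentiation gives

$$S_h^2 \frac{\partial W_h}{\partial y_h} + 2 W_h S_h \frac{\partial S_h}{\partial y_h} = y_h^2 f(y_h) - \frac{2 y_h f(y_h)}{W_h} \int_{y_{h-1}}^{y_h} t f(t)\,dt + \frac{f(y_h)}{W_h^2}\left[\int_{y_{h-1}}^{y_h} t f(t)\,dt\right]^2$$

$$= y_h^2 f(y_h) - 2 \mu_h y_h f(y_h) + \mu_h^2 f(y_h),$$

where $\mu_h$ is the mean of $y$ in stratum $h$, given by

$$\mu_h = \frac{1}{W_h} \int_{y_{h-1}}^{y_h} t f(t)\,dt.$$

Adding $S_h^2 (\partial W_h / \partial y_h)$ to the L.H.S. and the equal quantity $S_h^2 f(y_h)$ to the
R.H.S., we get

$$2 S_h^2 \frac{\partial W_h}{\partial y_h} + 2 W_h S_h \frac{\partial S_h}{\partial y_h} = y_h^2 f(y_h) - 2 y_h \mu_h f(y_h) + \mu_h^2 f(y_h) + S_h^2 f(y_h).$$

Dividing both the sides by $2 S_h$,

$$S_h \frac{\partial W_h}{\partial y_h} + W_h \frac{\partial S_h}{\partial y_h} = \frac{f(y_h)}{2} \cdot \frac{(y_h - \mu_h)^2 + S_h^2}{S_h}.$$

That is,

$$\frac{\partial (W_h S_h)}{\partial y_h} = \frac{f(y_h)}{2} \cdot \frac{(y_h - \mu_h)^2 + S_h^2}{S_h}.$$

Similarly,

$$\frac{\partial (W_{h+1} S_{h+1})}{\partial y_h} = -\frac{f(y_h)}{2} \cdot \frac{(y_h - \mu_{h+1})^2 + S_{h+1}^2}{S_{h+1}}.$$

Thus, minimization for $y_h$ gives

$$\frac{(y_h - \mu_h)^2 + S_h^2}{S_h} = \frac{(y_h - \mu_{h+1})^2 + S_{h+1}^2}{S_{h+1}}, \quad h = 1, \ldots, (k - 1).$$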
It is difficult to solve the above equation for $y_h$, as both $\mu_h$ and $S_h$ depend on $y_h$.
An approximate method to minimize $\sum W_h S_h$ has been suggested by Dalenius
and Hodges (1959). This is named the cum $\sqrt{f}$ rule and is discussed below. Let

$$A = \int_a^b \sqrt{f(t)}\,dt \quad \text{and} \quad A_h = \int_{y_{h-1}}^{y_h} \sqrt{f(t)}\,dt.$$

Obviously, $\sum_{h=1}^{k} A_h = A$. Now, write

$$W_h = \int_{y_{h-1}}^{y_h} f(t)\,dt = f_h (y_h - y_{h-1}),$$

where $f_h$ is the constant value of $f(y)$ in stratum $h$, and

$$S_h = \frac{1}{\sqrt{12}} (y_h - y_{h-1}).$$

Further,

$$A_h = \sqrt{f_h}\,(y_h - y_{h-1}).$$

Thus, we have

$$\sqrt{12} \sum_{h=1}^{k} W_h S_h = \sum_{h=1}^{k} f_h (y_h - y_{h-1})^2 = \sum_{h=1}^{k} A_h^2.$$

The minimum value of $\sum A_h^2$ with $\sum A_h = A$ is attained when $A_h = A/k$ ($h = 1, 2, \ldots, k$).
This shows that the optimal points of stratification for the Neyman
allocation are obtained by choosing the $y_h$ so that they create equal intervals on the
cumulative $\sqrt{f(y)}$ scale.
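In practice the rule is applied to a grouped frequency distribution. A minimal sketch under stated assumptions follows: the classes are taken to be of equal width, each boundary is placed at the class limit whose cumulative $\sqrt{f}$ is nearest to the target, and the function name is hypothetical.

```python
import math


def cum_sqrt_f_boundaries(upper_limits, freqs, k):
    """Approximate stratum boundaries by the cum sqrt(f) rule.

    upper_limits: upper limit of each (equal-width) class
    freqs:        class frequencies f
    k:            number of strata required
    """
    cum, running = [], 0.0
    for f in freqs:
        running += math.sqrt(f)
        cum.append(running)
    boundaries = []
    for h in range(1, k):
        target = h * running / k     # equal steps on the cum sqrt(f) scale
        idx = min(range(len(cum)), key=lambda i: abs(cum[i] - target))
        boundaries.append(upper_limits[idx])
    return boundaries


# A skewed distribution: the first class carries most of the units.
print(cum_sqrt_f_boundaries([1, 2, 3, 4, 5], [100, 25, 25, 25, 25], 2))
```

For classes of unequal width, $\sqrt{f}$ would first have to be adjusted for the class width; that refinement is left out of this sketch.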

Mahalanobis (1952) and Hansen, Hurwitz and Madow (1953) suggested
that the strata be so formed as to make $W_h \mu_h$ constant, and Ekman (1959) proposed
a technique that makes $W_h (y_h - y_{h-1})$ constant. Hess, Sethi and Balakrishnan
(1966) have made extensive empirical studies to compare different
approximate methods of optimal stratification.

As regards the optimal points of stratification under proportional allocation,
the points of stratification are given by

$$y_h = (\mu_h + \mu_{h+1})/2, \quad h = 1, \ldots, k - 1.$$

Dalenius (1957) developed a method to minimize $\sum W_h S_h$ when the regression
of $y$ on $x$ is linear and $V(y \mid x)$ is constant. Taga (1967) considered this case for
proportional allocation. Singh and Sukhatme (1969) considered a more general
set up and derived minimal equations giving optimum strata boundaries on the
$x$-scale for the case of Neyman allocation, along with suggestions of a number of
methods of finding approximate solutions. R. Singh (1971) has considered the
simultaneous determination of (i) optimum strata boundaries, (ii) optimum
number of strata and (iii) the sample allocation $n_h$ for the case of optimum
allocation minimizing the variance for fixed total cost.

Serfling (1968) has suggested the use of the cum $\sqrt{f}$ rule for stratification on the
auxiliary variable $x$ when the regression of $y$ on $x$ is linear with uncorrelated
homoscedastic errors and the correlation is positive and very high. Singh
(1975b) has also considered an alternative method of stratification on the
auxiliary variable and has suggested a new rule, which reduces to the cum $\sqrt{f}$ rule when the
correlation between $y$ and $x$ is unity.

5.2.12 Determination of Number of Strata 

The efficiency of stratification increases with the increase in the number of 
strata, but this increase becomes marginal after a certain stage. There may also 
occur situations where increase in number of strata may lead to less 
homogeneous strata and thereby decrease the efficiency of stratified sampling, 
if the principle of stratification is not properly taken care of. 

To have an idea how the variance decreases with the increase in the number of
strata, let us assume that the strata are constructed by the value of $y$. Suppose
further that the distribution of $y$ is rectangular in the interval $(a, a + d)$. Then for
simple random sampling $V(\bar{y}) = d^2/12n$. If $k$ equal sized strata are constructed,
the variance within any stratum is $S_{yh}^2 = d^2/12k^2$.

Hence, with $W_h = 1/k$ and $n_h = n/k$,

$$V_{N\text{-}opt} = \frac{1}{n}\left(\sum W_h S_{yh}\right)^2 = \frac{1}{n}\left(\sum_{h=1}^{k} \frac{1}{k} \cdot \frac{d}{\sqrt{12}\,k}\right)^2 = \frac{d^2}{12nk^2} = \frac{V(\bar{y})}{k^2},$$

which shows that $V(\bar{y}_{st})$ decreases inversely as the square of the number of
strata.

What happens when strata are constructed with the help of some other
variable $x$ correlated with $y$?

Assume the model $y = \alpha + \beta x + e$, where $V(e) = \text{constant} = S_e^2$. Now, with $n_h = n/k$,

$$V(\bar{y}_{st}) = \frac{k}{n} \sum W_h^2 S_{yh}^2 = \frac{k\beta^2}{n} \sum W_h^2 S_{xh}^2 + \frac{k S_e^2}{n} \sum W_h^2.$$

As for a set of $k$ strata $\sum W_h^2 \ge 1/k$, taking $S_{xh}^2 = S_x^2/k^2$ as in the rectangular case gives

$$V(\bar{y}_{st}) \ge \frac{1}{n}\left[\frac{\beta^2 S_x^2}{k^2} + S_e^2\right] = \frac{1}{n}\left[\frac{\rho^2 S_y^2}{k^2} + (1 - \rho^2) S_y^2\right] = \frac{S_y^2}{n}\left[\frac{\rho^2}{k^2} + (1 - \rho^2)\right],$$

where $\rho$ is the correlation between $y$ and $x$ in the unstratified population.
Empirical investigations carried out by Cochran (1961) based on the above
regression model indicate that unless $\rho$ exceeds 0.95, little reduction in
variance is to be expected beyond $k = 6$. Dalenius (1957) studied how the
cost is affected by an increase in the number of strata.
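The bracketed factor $\rho^2/k^2 + (1 - \rho^2)$ can be tabulated to see how quickly the gain from additional strata levels off. The sketch below only evaluates that factor; the function name is an assumption.

```python
def variance_ratio(rho, k):
    """Ratio of V(y_st) with k strata to the unstratified variance under the
    linear regression model: rho^2 / k^2 + (1 - rho^2)."""
    return rho ** 2 / k ** 2 + (1 - rho ** 2)


for rho in (0.85, 0.95, 0.99):
    row = ", ".join(f"k={k}: {variance_ratio(rho, k):.3f}" for k in (2, 4, 6, 10))
    print(f"rho={rho}: {row}")
```

However large $k$ becomes, the ratio cannot fall below $1 - \rho^2$, which is why additional strata contribute little unless $\rho$ is very close to unity.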

5.2.13 Optimum Allocation of Sample Size and Prior Distributions 

Cochran (1946) showed that when information on the auxiliary variable $x$ is
available, this prior information can be fruitfully utilized to set up a criterion of
optimality regarding $Y = (Y_{11}, Y_{12}, \ldots, Y_{kN_k})'$ as a realization of a random vector
depending on $X = (X_{11}, X_{12}, \ldots, X_{kN_k})'$ and some unknown parameters. Assume
the super population model $m$ formulated by

$$E_m(Y_{hj} \mid X_{hj}) = a + bX_{hj}$$

$$V_m(Y_{hj} \mid X_{hj}) = \sigma^2 X_{hj}^g, \quad g > 0$$

$$\mathrm{Cov}_m(Y_{hj}, Y_{h'j'} \mid X_{hj}, X_{h'j'}) = 0, \quad h \ne h', \; j \ne j',$$

where $E_m$, $V_m$ and $\mathrm{Cov}_m$ denote the conditional expectation, variance and
covariance given $X_{hj}$ respectively. In most practical situations $g$ happens to lie
between 1 and 2 as verified by Smith (1938), Jessen (1942), Mahalanobis (1944)
and Foreman and Brewer (1971).

For simple random sampling with replacement in each stratum

$$V(\bar{y}_{st}) = \sum_h W_h^2 \frac{\sigma_h^2}{n_h}, \quad \sigma_h^2 = \frac{1}{N_h} \sum_j (Y_{hj} - \bar{Y}_h)^2,$$

where $Y_{hj}$ is the value of $y$ for the $j$th unit of the $h$th stratum. Now,

$$E_m(\sigma_h^2) = b^2 \sigma_{xh}^2 + \sigma^2 \left(\frac{N_h - 1}{N_h}\right) \frac{\sum_j X_{hj}^g}{N_h}, \quad \sigma_{xh}^2 = \frac{\sum_j X_{hj}^2}{N_h} - \left(\frac{\sum_j X_{hj}}{N_h}\right)^2$$

$$= b^2 \sigma_{xh}^2 + \sigma^2 \sum_j X_{hj}^g / N_h, \quad \text{assuming } (N_h - 1)/N_h \approx 1$$

$$= (b^2 + \sigma^2)\,\sigma_{xh}^2 + \sigma^2 \left[\bar{X}_h^2 + \frac{1}{N_h}\left(\sum_j X_{hj}^g - \sum_j X_{hj}^2\right)\right].$$

Thus, if $\sigma_{xh}^2$ is proportional to $\bar{X}_h^2 + (1/N_h)\left[\sum_j X_{hj}^g - \sum_j X_{hj}^2\right]$, $\sigma_h^2$ can be
expected to be in the same proportion. Thus,

$$\sigma_h \propto \sqrt{\bar{X}_h^2 - \delta_h/N_h^2},$$

where

$$\delta_h = N_h \left(\sum_j X_{hj}^2 - \sum_j X_{hj}^g\right).$$

Neyman's optimum allocation is given by

$$n_{h,opt} = n \frac{N_h \sigma_h}{\sum_h N_h \sigma_h} = n \frac{N_h \sqrt{\bar{X}_h^2 - \delta_h/N_h^2}}{\sum_h N_h \sqrt{\bar{X}_h^2 - \delta_h/N_h^2}}.$$

For $g = 2$, $n_{h,opt} = n\left(N_h \bar{X}_h / \sum_h N_h \bar{X}_h\right)$. This result is due to T.J. Rao (1968,
1977b).
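The allocation above depends only on the $x$-values in each stratum and on $g$, so it can be computed before any $y$ is observed. A minimal sketch follows; the function name and the list-of-lists input are assumptions, and the expression under the square root is assumed to be positive for the data supplied.

```python
import math


def model_based_allocation(strata_x, n, g):
    """n_h proportional to N_h * sqrt(Xbar_h^2 - delta_h / N_h^2), where
    delta_h = N_h * (sum X^2 - sum X^g) within stratum h.

    strata_x: list of lists, the x-values of all units in each stratum
    """
    weights = []
    for xs in strata_x:
        N_h = len(xs)
        mean = sum(xs) / N_h
        delta = N_h * (sum(x ** 2 for x in xs) - sum(x ** g for x in xs))
        # Assumed positive; this can fail for small g and highly variable x.
        weights.append(N_h * math.sqrt(mean ** 2 - delta / N_h ** 2))
    total = sum(weights)
    return [n * w / total for w in weights]


# For g = 2 the allocation is proportional to the stratum totals of x.
print(model_based_allocation([[1.0, 2.0, 3.0], [4.0, 5.0]], 30, 2))
```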

5.2.14 Stratified Sampling for Proportions 


If we are interested in estimating the proportion of units falling into a defined class
$C$, we may construct strata such that the proportion in class $C$ varies as much as
possible from stratum to stratum.

Define $P_h = A_h/N_h$ and $p_h = a_h/n_h$ as the proportions of units falling into
class $C$ in the $h$th stratum and in the sample chosen from that stratum. Thus, the
stratified estimate of the population proportion $P$ is

$$p_{st} = \sum_h W_h p_h$$

with

$$V(p_{st}) = \frac{1}{N^2} \sum_h \frac{N_h^2 (N_h - n_h)}{N_h - 1} \cdot \frac{P_h Q_h}{n_h} \cong \sum_h W_h^2 P_h Q_h / n_h, \quad \text{neglecting f.p.c.}$$

The sample estimate of variance is obtained as Est. $V(p_{st}) = \sum W_h^2 p_h q_h/(n_h - 1)$.
The formulae for proportional and optimum allocation can be obtained by using the same
technique as in the case of estimation of the population mean.
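The estimate and its estimated variance (f.p.c. neglected) may be computed as in the following sketch. The function name and the argument layout are assumptions made for illustration.

```python
def stratified_proportion(N_h, n_h, a_h):
    """Stratified estimate of a population proportion and its estimated
    variance, neglecting the f.p.c.

    N_h: stratum sizes, n_h: sample sizes, a_h: sample counts in class C
    """
    N = sum(N_h)
    p_st, var = 0.0, 0.0
    for Nh, nh, ah in zip(N_h, n_h, a_h):
        W, p = Nh / N, ah / nh
        p_st += W * p
        var += W ** 2 * p * (1 - p) / (nh - 1)
    return p_st, var


p_st, var = stratified_proportion([100, 300], [10, 30], [5, 6])
print(round(p_st, 4), round(var, 6))
```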


5.3 Stratified Horvitz-Thompson Estimator 


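Let $\pi_{j(h)}$ be the inclusion probability of the $j$th unit of the $h$th stratum
(sub-population) and $\pi_{ij(h)}$ be the inclusion probability of the pair $(i, j)$ of the $h$th
stratum in the sample $s_h$, with $\pi_{j(h)} > 0$ and $\pi_{ij(h)} > 0$ for $i, j = 1, \ldots, N_h$; $h = 1, \ldots, k$.
Thus, the estimate of the population mean is

$$\bar{y}_{st(HT)} = \frac{1}{N} \sum_{h=1}^{k} \sum_{j=1}^{n_h} \frac{y_{hj}}{\pi_{j(h)}},$$

with

$$V(\bar{y}_{st(HT)}) = \frac{1}{N^2} \sum_{h=1}^{k} \sum_{i<j}^{N_h} \left(\frac{Y_{hi}}{\pi_{i(h)}} - \frac{Y_{hj}}{\pi_{j(h)}}\right)^2 \left(\pi_{i(h)} \pi_{j(h)} - \pi_{ij(h)}\right)$$

and

$$\text{Est. } V(\bar{y}_{st(HT)}) = \frac{1}{N^2} \sum_{h=1}^{k} \sum_{i<j}^{n_h} \left(\frac{y_{hi}}{\pi_{i(h)}} - \frac{y_{hj}}{\pi_{j(h)}}\right)^2 \frac{\left(\pi_{i(h)} \pi_{j(h)} - \pi_{ij(h)}\right)}{\pi_{ij(h)}}.$$

For a $\pi$PS design with the $x$-characteristic used as the size measure,

$$\bar{y}_{st(HT)} = \frac{1}{N} \sum_h \sum_j \frac{y_{hj}}{\pi_{j(h)}} = \frac{1}{N} \sum_h \sum_j \frac{y_{hj}}{n_h x_{hj}/X_h}.$$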


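Cassel et al. (1977) discussed the correspondence between the Horvitz-Thompson
estimator and stratified random sampling for the special case of the
transformation model. T.J. Rao (1983a) has considered a more general model
for the same purpose. We now derive the allocation of sample sizes $n_h$ with $\sum n_h = n$
which minimizes the expected variance of $\bar{y}_{st(HT)}$ under the class of prior
distributions $H = \{\theta\}$ for which

$$E_\theta(Y_{hj} \mid X_{hj}) = bX_{hj}$$

$$V_\theta(Y_{hj} \mid X_{hj}) = v(X_{hj})$$

$$\mathrm{cov}_\theta(Y_{hj}, Y_{h'j'} \mid X_{hj}, X_{h'j'}) = 0, \quad h \ne h', \; j \ne j'.$$

Thus, with $\pi_{j(h)} = n_h X_{hj}/X_h$,

$$E_\theta V(\bar{y}_{st(HT)}) = \frac{1}{N^2} \sum_{h=1}^{k} \sum_{j=1}^{N_h} \left(\frac{1}{\pi_{j(h)}} - 1\right) v(X_{hj}) = \frac{1}{N^2} \sum_{h=1}^{k} \sum_{j=1}^{N_h} \left(\frac{X_h}{n_h X_{hj}} - 1\right) v(X_{hj}).$$

Minimization of $E_\theta V(\bar{y}_{st(HT)})$ subject to the condition $\sum n_h = n$ gives
$n_{h,opt}$ proportional to

$$\sqrt{X_h \sum_{j=1}^{N_h} \frac{v(X_{hj})}{X_{hj}}},$$

which is the $\theta$-optimum allocation.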


Theorem 5.1: Under the $\theta$-model, the unstratified $\pi$PS sampling strategy with the
HT-estimator is inferior to the stratified $\pi$PS sampling strategy with the
corresponding HT-estimator and with $\theta$-optimum allocation.

Proof. In the case of unstratified $\pi$PS sampling the expected variance of the
HT-estimator of the population total is given by

$$E_\theta[V(\hat{Y}_{HT})] = \sum_h \sum_j \left(\frac{1}{\pi'_{hj}} - 1\right) v(X_{hj}),$$

where $\pi'_{hj} = nX_{hj}/X$. Further, for stratified $\pi$PS sampling with $\theta$-optimum
allocation

$$E_\theta[V(\hat{Y}_{st(HT)})] = \sum_h \sum_j \left(\frac{1}{\pi_{j(h)}} - 1\right) v(X_{hj}),$$

where $\pi_{j(h)} = n_{h,opt} X_{hj}/X_h$. It may be seen that $E_\theta[V(\hat{Y}_{HT})] - E_\theta[V(\hat{Y}_{st(HT)})] \ge 0$;
equality holds if

$$X_h \propto \sum_j \frac{v(X_{hj})}{X_{hj}}.$$

This result is due to T.J. Rao (1977b). Ramachandran and Rao (1974) investigated
whether stratified $\pi$PS sampling with various non-optimal allocations is
likely to be worthwhile and what its practical consequences are.
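The inequality of Theorem 5.1 can be checked numerically for any variance function $v(x)$. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the allocation is left unrounded, and no check is made that every $\pi_{j(h)} \le 1$.

```python
import math


def theta_optimum_allocation(strata_x, v, n):
    """n_h proportional to sqrt(X_h * sum_j v(X_hj) / X_hj)."""
    w = [math.sqrt(sum(xs) * sum(v(x) / x for x in xs)) for xs in strata_x]
    total = sum(w)
    return [n * wi / total for wi in w]


def expected_variance_stratified(strata_x, v, alloc):
    """Expected variance of the HT estimator of the total, stratified piPS."""
    out = 0.0
    for xs, n_h in zip(strata_x, alloc):
        X_h = sum(xs)
        out += sum((X_h / (n_h * x) - 1) * v(x) for x in xs)
    return out


def expected_variance_unstratified(strata_x, v, n):
    """Expected variance of the HT estimator of the total, unstratified piPS."""
    X = sum(sum(xs) for xs in strata_x)
    return sum((X / (n * x) - 1) * v(x) for xs in strata_x for x in xs)


strata = [[1.0, 2.0, 3.0], [10.0, 20.0]]
v = lambda x: x                      # variance function with g = 1
alloc = theta_optimum_allocation(strata, v, 2)
print(expected_variance_unstratified(strata, v, 2),
      expected_variance_stratified(strata, v, alloc))
```

Taking $v(x) = x^2$ in this example makes $X_h$ proportional to $\sum_j v(X_{hj})/X_{hj}$, and the two expected variances then coincide.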




5.4 Stratified PPSWR Estimator 

Let $P_{hj}$ be the probability of selection of the $j$th unit from the $h$th stratum ($j = 1, \ldots, N_h$;
$h = 1, \ldots, k$). Then

$$\sum_{j=1}^{N_h} P_{hj} = 1 \quad \text{for each } h.$$

Define $T_{hj} = Y_{hj}/N_h P_{hj}$, $\bar{T}_{h.} = \sum_j P_{hj} T_{hj} = \bar{Y}_h$, $\bar{T} = \sum_h W_h \bar{T}_{h.} = \bar{Y}$. Further,

$$t_h = \frac{1}{n_h} \sum_{j=1}^{n_h} t_{hj} = \frac{1}{n_h} \sum_{j=1}^{n_h} \frac{y_{hj}}{N_h p_{hj}}$$

provides an unbiased estimate of $\bar{Y}_h$, the population mean in the $h$th stratum. An
unbiased estimator of the population mean is given by $t_{st} = \sum W_h t_h$. The
variance is derived as

$$V(t_{st}) = \sum_{h=1}^{k} \frac{W_h^2}{n_h} \sum_{j=1}^{N_h} P_{hj} (T_{hj} - \bar{T}_{h.})^2.$$

An unbiased estimator of the variance of $t_{st}$ is given by

$$\text{Est. } V(t_{st}) = \sum_{h=1}^{k} W_h^2 \frac{s_h^2}{n_h},$$

where

$$s_h^2 = \frac{1}{n_h - 1} \sum_{j=1}^{n_h} (t_{hj} - t_h)^2.$$

Raj (1968) has shown that in the case of PPS sampling, stratification with
$X$-proportional allocation, $n_h = nX_h/X$, will always be superior to unstratified
sampling. The gains from stratification would be considerable if the strata
ratios $R_h = Y_h/X_h$ differ considerably among themselves.
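The estimator $t_{st}$ and its variance estimator are easily computed from the draws. A minimal sketch follows, assuming each stratum is supplied as a pair `(N_h, draws)` in which `draws` lists the `(y, p)` values of the units drawn with replacement; this layout and the function name are assumptions.

```python
def stratified_ppswr(strata, N):
    """Stratified PPSWR estimate of the population mean and its
    estimated variance.

    strata: list of (N_h, draws); draws is a list of (y_hj, p_hj) for the
            n_h units drawn with replacement (at least two per stratum)
    N:      population size
    """
    t_st, var = 0.0, 0.0
    for N_h, draws in strata:
        n_h = len(draws)
        t = [y / (N_h * p) for y, p in draws]          # t_hj = y_hj / (N_h p_hj)
        t_h = sum(t) / n_h
        s2 = sum((x - t_h) ** 2 for x in t) / (n_h - 1)
        W = N_h / N
        t_st += W * t_h
        var += W ** 2 * s2 / n_h
    return t_st, var


strata = [(4, [(10, 0.5), (4, 0.25)]), (6, [(6, 0.2), (3, 0.1)])]
print(stratified_ppswr(strata, 10))
```

In the second stratum $y$ is exactly proportional to the selection probabilities, so that stratum contributes nothing to the estimated variance.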

5.5 Controlled Selection 

Goodman and Kish (1950) suggested a method of selection termed controlled 
selection which has some similarities with stratified sampling design and in fact 
it looks beyond stratification exercising further control over it. This selection 
procedure ensures greater probabilities of selection for some or all preferred 
combinations of $n$ out of $N$ units and smaller probabilities of selection for some
or all non-preferred combinations than in the corresponding stratified sampling 
design. The need for controlled selection arises from the fact that uncontrolled 
selection might pose serious inconveniences due to inaccessibility of certain 
areas or peculiarities of certain units of the population under survey which 
might affect the organisation and supervision of field work resulting in good 
amount of non-response and reporting errors and thereby putting doubt on the 
reliability of survey results. Hess, Reidel and Fitzpatrick (1961) have used 
controlled selection method in selecting hospitals and patients and have 
illustrated how additional control can be carried out to select one hospital from 
each of two strata consisting of 4 large hospitals (A, B, C, D) and 5 small 
hospitals (a, b, c, d, e). These hospitals have further characteristics that A, B, a 
and b belong to ownership Code 1 and others belong to ownership Code 2. If 
one hospital is selected by SRS from the first stratum, its probability of 
selection is 0.25 and in case of second stratum the probability of selection of a 
hospital is 0.20. If one hospital at random is selected from each of the strata, 
there may be a possibility that one may end up in selecting two hospitals 
belonging to a particular ownership code. To avoid such a situation or to give 
low probability to such a situation, Hess et al., devised a procedure of 
controlled selection to increase the probability of getting samples with different 
ownership codes while retaining at the same time the originally assigned 
probabilities of selection. This is shown in the following Table 5.1. 


Table 5.1 Stratified controlled sampling with probabilities of selection.

                          Stratum 1
Stratum 2       A       B       C       D       Total
    a           -       -      0.15    0.05     0.20
    b           -       -       -      0.20     0.20
    c          0.20     -       -       -       0.20
    d          0.05    0.15     -       -       0.20
    e           -      0.10    0.10     -       0.20
  Total        0.25    0.25    0.25    0.25     1.00


The only non-preferred combination (C, e) is having probability 0.10. The 
preferred combinations have increased their probabilities totalling 0.90 as 
against 0.50 in case of uncontrolled selection. Thus, we have seen in the 
foregoing illustration that the problem has been posed in a stratified sampling 
set up with selection in different strata being dependent on each other and we 
have planned to select two units from a two way stratified data — one according 
to size and other according to ownership, with a view to make the units in a 
sample as heterogeneous as possible as regards ownership criteria. This 
problem of controlled selection may be viewed in a general set up to select a 
sample when the number of criteria on which stratification is made (deep 
stratification) is more than the sample size. Keeping this in view Bryant, 
Hartley and Jessen (1960) have proposed a technique making it possible to 
estimate the population mean from a sample size n which is not large enough to 
provide an allocation to each stratum in a two way table. 

Controlled selection can alternatively be carried out by viewing the controlled
sampling design as a variant of the simple random sampling design, or in other
words by making controlled sampling equivalent to simple random sampling, with the
sole purpose of unbiasedly estimating the population mean by the sample mean
with variance equal to that of SRSWOR, besides avoiding totally or partially the
non-preferred samples. This may be achieved by constructing the
controlled design in such a way that the inclusion probability of any pair of
units of the population in the sample remains equal to $\pi_{ij} = n(n-1)/N(N-1)$,
$1 \le i < j \le N$, which is the second order inclusion probability in the case of SRSWOR $(N, n)$,
and this equivalence makes the variance of the sample mean under a
controlled sampling design equal to that of a SRSWOR $(N, n)$ design.
Further, $\pi_{ij} = n(n-1)/N(N-1)$ implies $\pi_i = n/N$ and as such the sample mean
under controlled selection also becomes an unbiased estimate.

Avadhani and Sukhatme (1965, 67, 68, 73) have carried out analytical
investigations into controlled sampling and also evolved a mechanism to draw a
controlled sample using a Balanced Incomplete Block Design (BIBD). Their
controlled sampling design is a variant of the SRSWOR design with the same second
order inclusion probabilities as those of the SRSWOR $(N, n)$ design. This
correspondence between the concept of restricted randomization in the
sampling context and the concept in experimental design first originated in the
work of M.C. Chakravarti (1963), and further research in this direction is due
to Wynn (1977), Foody and Hedayat (1977), Sinha (1976), Rao and Nigam
(1990, 1992), Hedayat and Majumdar (1992), Sengupta (1982a) and others.

All possible $\binom{N}{n}$ samples of SRSWOR $(N, n)$ constitute the support of a
SRSWOR design with size $\binom{N}{n}$, which has the property of a BIBD with
$v$ (number of treatments) = $N$ (number of units in the population), $k$ (number of plots
in a block) = $n$ (sample size), $b$ (number of blocks) = $\binom{N}{n}$ = size of support. In
controlled selection our chief objective is to develop a sampling plan equivalent
to simple random sampling with respect to first and second order inclusion
probabilities, but with smaller support than SRS. In this connection Chakravarti
(1963) has shown that a sampling design with uniform selection probabilities
over the samples in the support (uniform design) is equivalent to SRSWOR
with respect to all first and second order inclusion probabilities if and only if it
is associated with a BIBD with $N = v$ and $n = k$, which has distinct blocks.
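As an illustration of this result, the seven blocks of the BIBD with $v = 7$, $k = 3$ (the Fano plane) form a uniform design for $N = 7$, $n = 3$ whose support has 7 samples instead of $\binom{7}{3} = 35$. The sketch below checks that every pair of units has inclusion probability $n(n-1)/N(N-1)$; the choice of this particular design and the function name are assumptions made here for illustration.

```python
from fractions import Fraction
from itertools import combinations


def pair_inclusion_probabilities(support, N):
    """Second order inclusion probabilities of a uniform design whose
    support is the given list of samples (units labelled 1..N)."""
    p = Fraction(1, len(support))                  # equal probability per sample
    pi = {pair: Fraction(0) for pair in combinations(range(1, N + 1), 2)}
    for s in support:
        for pair in combinations(sorted(s), 2):
            pi[pair] += p
    return pi


# Blocks of the BIBD with v = 7, b = 7, k = 3, lambda = 1.
fano = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]
pi = pair_inclusion_probabilities(fano, 7)
print(set(pi.values()))   # a single value, equal to n(n-1)/N(N-1)
```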

The crux of the problem in the above connection is how to achieve the
reduction of support size and to what extent. Wynn (1977) has set bounds on
the improvement that could be achievable. His findings indicate that the
number of possible samples in the support of the sampling plan can be reduced
from $\binom{N}{n}$ to between $\binom{N}{2}$ and $N$ by careful structuring and balancing. Wynn
(1977) and Foody and Hedayat (1977) have constructed non-uniform designs
equivalent to SRSWOR by making use of repeated blocks in BIBD for
increasing the probability of selection of certain samples while still preserving
the first and second order inclusion probabilities, thus removing the
restriction regarding distinct blocks in Chakravarti's key result. Fienberg and
Tanur (1987) have suggested an interesting application of variants of SRSWOR
designs where some specified samples would be desired to receive higher
probabilities of selection because of certain demanding requirements of the
users. In the case of a design equivalent to SRSWOR $(N, n)$ the probability
distribution over the support may or may not be uniform. Hedayat and Sinha
(1991) have identified three possibilities and presented an illustration for each
of these cases with $N = 6$ and $n = 3$. These possibilities are:

i) The support size is less than $\binom{N}{n}$, but the probability distribution over the
support is uniform.

ii) The support size is less than $\binom{N}{n}$ and the probability distribution over the
support is not uniform.

iii) The support size is $\binom{N}{n}$ and the probability distribution over the support is
not uniform.


Although for each pair $(N, n)$ there is a sampling design equivalent to
SRSWOR $(N, n)$ corresponding to (ii) and (iii), there are pairs $(N, n)$ for which
there might not be a sampling design equivalent to SRSWOR with support size
smaller than $\binom{N}{n}$ and with uniform distribution of probability over the
support, for example the pair (8, 3). For controlled selection, manipulation of
probabilities over possible samples can be made possible by the application of
linear programming techniques as originally suggested by Foody and Hedayat
(1977) and subsequently taken up by Rao and Nigam (1990, 1992), who have
suggested optimal controlled plans using a unified approach. The unified
approach suggested by Rao and Nigam (1992) runs as follows. Define a
general linear estimator of the form

$$\hat{Y} = \sum_{i \in s} d_{is} y_i, \quad s \in S.$$

Let $p_0(s)$ be the probability of selecting $s$ under a specified uncontrolled
sampling plan and $p(s)$ be the same under the controlled plan $\{s, p(s); s \in S\}$, which
is chosen such that $p(s) \ge 0$,

$$\sum_{s \in S} p(s) = 1$$

and the unbiasedness condition

$$\sum_{s \ni i} d_{is}\, p(s) = 1, \quad i = 1, \ldots, N.$$

The variance of the Horvitz-Thompson estimator $\hat{Y}$ under $p(s)$ must necessarily be
of the form (Rao, 1979)

$$V(\hat{Y}) = -\sum\sum_{i<j} d_{ij}\, w_i w_j \left(\frac{y_i}{w_i} - \frac{y_j}{w_j}\right)^2,$$

where $d_{ij} = \sum_{s \ni i,j} d_{is} d_{js}\, p(s) - 1$ and $V(\hat{Y}) = 0$ for $y_i$ proportional to known
constants $w_i$. Denote $d_{ij}$ under the uncontrolled plan $p_0(s)$ as $d_{ij}^0$.

Let $S_1$ be the subset of non-desired combinations. To obtain an optimum
controlled sampling plan we minimize the objective function

$$\phi = \sum_{s \in S_1} p(s)$$

with respect to the variables $\{p(s), s \in S\}$ subject to the constraints

i) $p(s) \ge 0$, $s \in S$,

ii) $\sum_{s \in S} p(s) = 1$,

iii) $\sum_{s \ni i} d_{is}\, p(s) = 1$, $i = 1, \ldots, N$,

iv) the variance matched condition $\sum_{s \ni i,j} d_{is} d_{js}\, p(s) = d_{ij}^0 + 1$, $i < j = 1, \ldots, N$.

The variance is then estimated by

$$\text{est. var}(\hat{Y}) = -\sum\sum_{i<j \in s} \frac{d_{ij}}{\pi_{ij}}\, w_i w_j \left(\frac{y_i}{w_i} - \frac{y_j}{w_j}\right)^2.$$

Avadhani and Sukhatme (1967) have also extended the scope of
controlled selection to unequal probability sampling designs and have devised a
technique such that the probability of a non-preferred sample is kept as
small as possible and at the same time any sample chosen by it provides an
unbiased estimate of the population total.

Gupta, Nigam and Kumar (1982) and Kumar, Gupta and Nigam (1985)
have presented methods based on BIBD to manipulate the support of $\pi$PS designs.
Hedayat, Lin and Stufken (1989) have proposed a novel technique for the
construction of $\pi$PS sampling designs by emptying boxes. The technique for
construction consists of playing a game in which objects are removed from $N$
boxes, $n$ at a time and at most one from each box at a time. They have shown
that the minimum support size over all $\pi$PS sampling designs for given $N$, $n$ and
$z_1, z_2, \ldots, z_N$, where $z_i = x_i/X$, does not exceed $N$.

NUMERICAL EXAMPLES 

Example 5.1: The following data show the stratification of 2000 households in
a cluster of 10 villages according to cultivated land holding sizes and the
average acres under summer paddy per holding size in each stratum along with
the standard deviation.

Table 5 E.1

Holding size    Number of          Average acres under          Standard
(acres)         holdings $N_h$     summer paddy ($\bar{Y}_h$)   deviation ($S_h$)
0-2.5           660                1.5                          1.2
2.5-5.0         930                3.7                          2.5
Above 5         410                6.2                          4.2

Calculate the sampling variance of the estimated area under summer paddy for
the cluster of villages for a sample of 200 households,

i) if the households are selected by the method of SRSWOR without
stratification.

ii) if the households are selected by the method of SRSWOR within each
stratum and allocated in proportion to (a) the number of households in each
stratum ($N_h$), (b) the product $N_h S_h$.

Also, compute the gain in efficiency of stratified random sampling from the
two methods of allocation as compared with unstratified simple random
sampling.

Solution: $N = 2000$, $n = 200$, $\bar{Y} = 3.4865$.

Strata     $W_h$    $S_h$   $S_h^2$   $W_h S_h$   $W_h S_h^2$   $\bar{Y}_h$   $W_h(\bar{Y}_h - \bar{Y})^2$
0-2.5      0.33     1.2     1.44      0.396       0.4752        1.5           1.3022
2.5-5      0.465    2.5     6.25      1.1625      2.9062        3.7           0.0212
Above 5    0.205    4.2     17.64     0.861       3.6162        6.2           1.5094
Total                                 2.4195      6.9976                      2.8328

$V(\bar{y}_{st})_{prop} = [(N - n)/Nn] \sum W_h S_h^2 = (0.0045) \times (6.9976) = 0.031489$

$V(\bar{y}_{st})_{opt} = \left(\sum W_h S_h\right)^2/n - \sum W_h S_h^2/N$
= $(2.4195)^2/200 - 6.9976/2000$
= 0.029270 - 0.003499 = 0.025771

$V(\bar{y})_{ran} = V(\bar{y}_{st})_{prop} + [(N - n)/Nn] \sum W_h (\bar{Y}_h - \bar{Y})^2$
= 0.031489 + (0.0045)(2.8328) = 0.044237

Gain in efficiency compared to unrestricted simple random sampling:

i) Proportional allocation:

$$\left(\frac{V_{ran}}{V_{prop}} - 1\right) \times 100 = 40.48\%$$

ii) Optimum allocation:

$$\left(\frac{V_{ran}}{V_{opt}} - 1\right) \times 100 = 71.65\%$$
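The arithmetic of this example can be reproduced with a short script. It is only a sketch: the function and key names are assumptions, and the formulas are exactly those used in the solution above.

```python
def example_5_1(N_h, Ybar_h, S_h, n):
    """Variances of the estimated mean under proportional allocation,
    optimum allocation and unstratified SRSWOR."""
    N = sum(N_h)
    W = [Nh / N for Nh in N_h]
    Ybar = sum(w * y for w, y in zip(W, Ybar_h))
    sum_WS = sum(w * s for w, s in zip(W, S_h))
    sum_WS2 = sum(w * s ** 2 for w, s in zip(W, S_h))
    between = sum(w * (y - Ybar) ** 2 for w, y in zip(W, Ybar_h))
    fpc = (N - n) / (N * n)
    v_prop = fpc * sum_WS2
    v_opt = sum_WS ** 2 / n - sum_WS2 / N
    v_ran = v_prop + fpc * between
    return {
        "Ybar": Ybar, "v_prop": v_prop, "v_opt": v_opt, "v_ran": v_ran,
        "gain_prop": (v_ran / v_prop - 1) * 100,
        "gain_opt": (v_ran / v_opt - 1) * 100,
    }


result = example_5_1([660, 930, 410], [1.5, 3.7, 6.2], [1.2, 2.5, 4.2], 200)
for name, value in result.items():
    print(name, round(value, 6))
```

Small differences in the last digit relative to the hand computation arise because the table values were rounded before being combined.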


Example 5.2: A survey for estimating the cost of milk production was undertaken
by the Indian Council of Agricultural Research in Chennai during
1957-59. All the stables of Madras were stratified on the basis of the number of
milch cattle into three size strata. A simple random sample of stables was
selected from each of the three strata. The data given below give the milk yields
of the selected urban stables in winter, 1958-59.

Estimate the total milk yield of Madras city in winter 1958-59 and give its
standard error. Also estimate the gain in precision due to stratification.

Table 5 E.2

Strata   Stable size            Total no. of       No. of stables       Total milk yields of the selected
no.                             stables ($N_h$)    in sample ($n_h$)    stables (100 litres) ($Y_{hj}$)
1        One milch animal       3568               16                   25, 0, 4, 14, 13, 9, 11, 16,
                                                                        0, 15, 5, 19, 0, 5, 9, 5
2        2-8 milch animals      3209               16                   7, 49, 24, 58, 30, 29, 19, 51,
                                                                        15, 58, 13, 48, 20, 49, 66, 39
3        More than 8 milch      563                10                   200, 100, 96, 315, 151, 125,
         animals                                                        381, 253, 267, 176

Solution: $N = \sum N_h = 7340$, $n = \sum n_h = 42$
$W_1 = N_1/N = 3568/7340 = 0.4861$
$W_2 = N_2/N = 3209/7340 = 0.4372$
$W_3 = N_3/N = 563/7340 = 0.0767$
$\bar{y}_1 = 150/16 = 9.375$, $\bar{y}_2 = 575/16 = 35.9375$,
$\bar{y}_3 = 2064/10 = 206.4$
$s_1^2 = 53.32$, $s_2^2 = 344.60$, $s_3^2 = 16136.40$
$\bar{y}_{st} = W_1 \bar{y}_1 + W_2 \bar{y}_2 + W_3 \bar{y}_3 = 36.1004$
The estimate of the population total is

$\hat{Y}_{st} = N \bar{y}_{st} = 264976.93$

$V_{st} = v(\bar{y}_{st}) = \text{Est. } V(\bar{y}_{st}) = \sum W_h^2 (1/n_h - 1/N_h)\, s_h^2 = 14.2306$

Est. S.E. $(\hat{Y}_{st}) = N \sqrt{\text{Est. } V(\bar{y}_{st})} = 27689.02$

$$V_{ran} = \frac{N - n}{n(N - 1)} \left[\frac{1}{N} \sum_h \frac{N_h}{n_h} \sum_j y_{hj}^2 - \bar{y}_{st}^2 + v(\bar{y}_{st})\right]$$

= (7340 - 42)/(42 x 7339) [4671.7264 - 1303.2388 + 14.2306]
= (0.0236765)(3382.7182) = 80.09
R.E. = $(V_{ran}/V_{st}) \times 100$ = (80.09/14.23) x 100 = 563%.


Example 5.3: For the cost functions

i) $C = C_0 + \sum C_h n_h$,

ii) $C = C_0 + \sum C_h \sqrt{n_h}$, where $C_0$ and $C_h$ are known costs, find the optimum $n_h$ by
minimizing $V(\bar{y}_{st})$ for fixed total cost with the total sample size being 60, given the
following data.

Stratum    $N_h$    $S_h$    $C_h$
1          30       1.5      1
2          40       2.0      2
3          60       3.5      3
4          70       4.0      4

Also compute the optimum allocation for $n_h \propto N_h S_h$ and compare the efficiency of the
optimum allocations with that of proportional allocation to estimate the
population mean $\bar{Y}$.


Solution: Under the proportional allocation $n_h \propto N_h$,
$n_1 = 9$, $n_2 = 12$, $n_3 = 18$, $n_4 = 21$.

Under the cost function (i) the optimum allocation becomes $n_h \propto N_h S_h/\sqrt{C_h}$ with
optimum sample sizes:

$n_{1(opt)} \approx 8$, $n_{2(opt)} \approx 9$, $n_{3(opt)} \approx 20$, $n_{4(opt)} \approx 23$.

Under the cost function (ii) the optimum allocation becomes $n_h \propto [N_h^2 S_h^2/C_h]^{2/3}$
with optimum sample sizes:

$n'_{1(opt)} = 2$, $n'_{2(opt)} \approx 8$, $n'_{3(opt)} = 23$, $n'_{4(opt)} = 27$.

Under the optimum allocation $n_h \propto N_h S_h$ the optimum sample sizes are

$n''_{1(opt)} \approx 4$, $n''_{2(opt)} \approx 8$, $n''_{3(opt)} \approx 21$, $n''_{4(opt)} = 27$.

Calculation of sampling variance and relative efficiencies

Allocation                              Variance    Relative efficiency
Proportional                            0.1215      100
$n_h \propto N_h S_h$                   0.1057      115
$n_h \propto N_h S_h/\sqrt{C_h}$        0.1124      108
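The variances in the table can be verified from the stratum data and the rounded sample sizes. The sketch below assumes SRSWOR within strata, so that $V(\bar{y}_{st}) = \sum W_h^2 S_h^2 (1/n_h - 1/N_h)$; the function names are hypothetical.

```python
import math


def proportional_shares(weights, n):
    """Split n in proportion to the given weights (unrounded)."""
    total = sum(weights)
    return [n * w / total for w in weights]


def stratified_variance(N_h, S_h, n_h):
    """V(y_st) = sum W_h^2 S_h^2 (1/n_h - 1/N_h) for SRSWOR within strata."""
    N = sum(N_h)
    return sum((Nh / N) ** 2 * s ** 2 * (1 / m - 1 / Nh)
               for Nh, s, m in zip(N_h, S_h, n_h))


N_h = [30, 40, 60, 70]
S_h = [1.5, 2.0, 3.5, 4.0]
C_h = [1, 2, 3, 4]

print(proportional_shares(N_h, 60))                                   # n_h prop. to N_h
print(proportional_shares([N * S for N, S in zip(N_h, S_h)], 60))     # N_h S_h
print(proportional_shares([N * S / math.sqrt(C)
                           for N, S, C in zip(N_h, S_h, C_h)], 60))   # N_h S_h / sqrt(C_h)

v_prop = stratified_variance(N_h, S_h, [9, 12, 18, 21])
v_neyman = stratified_variance(N_h, S_h, [4, 8, 21, 27])
v_cost = stratified_variance(N_h, S_h, [8, 9, 20, 23])
print(round(v_prop, 4), round(v_neyman, 4), round(v_cost, 4))
print(round(100 * v_prop / v_neyman), round(100 * v_prop / v_cost))
```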
EXERCISES

5.1 Discuss the behaviour of the ratios (i) $V_{prop}(\bar{y}_{st})/V(\bar{y})$ and (ii) $V_{opt}(\bar{y}_{st})/V(\bar{y})$
as the sample size $n$ increases.

5.2 Derive the variance of the estimates of the population proportion $P$ from a
stratified sample of size $n$ under proportional and optimum allocations
and obtain their unbiased estimates. Further, show that

$$V_{ran} = V_{prop} + \frac{1}{n} \sum W_h (P_h - P)^2$$

and

$$V_{prop} = V_{opt} + \frac{1}{n} \sum W_h \left[\sqrt{P_h Q_h} - \sum W_h \sqrt{P_h Q_h}\right]^2.$$

5.3 Under the cost function $C = C_0 + \sum C_h n_h$, allocate the total sample size $n$ to
the strata for minimizing the variance of the variance estimator. How does 
it compare with the allocation obtained by minimizing the variance of the 
estimate? 

5.4 From the following data estimate the gain in precision due to stratification
to estimate the population mean and also compare the result with the gain
that would have been obtained from proportional allocation.

Stratum boundaries    $N_h$    $W_h$     $\bar{y}_h$   $s_h^2$   $n_h$
11-15                 7850     0.6012    3.2           20.4      1250
16-20                 4257     0.2235    8.5           70.3      600
21-25                 1043     0.1028    15.2          169.5     250
26-30                 350      0.0725    25.8          411.2     150

$N$ = 13,530, $n$ = 2250


5.5 Derive optimum points of stratification of the hypothetical populations 
represented by distribution functions 

i) $f(y) = $ , $0 < y < \infty$,

ii) $f(y) = f(\chi^2)$ with 1 d.f., $0 < \chi^2 < \infty$

in case of (a) proportional allocation (b) Neyman allocation (c) Equal 
allocation under which equal number of units are sampled from each 
stratum. 

5.6 A sample has been obtained by sampling with probability proportionate
to x with replacement. Estimate the variance of the estimator of
population total under (i) simple random sampling (ii) stratified random
sampling under stratification by x-values. (Konijn, 1973)

5.7 A population is stratified into K strata according to a size variable x, the
value associated with the ith stratum being $x_i$. Assume that
$y_{ij} = \alpha + \beta x_i + e_{ij}$, i = 1, ..., K, j = 1, ..., $N_i$, where
$E(e_{ij} | x_i) = 0$, $V(e_{ij} | x_i) = \sigma^2 x_i^g$. Denote $V_{prop}$,
$V_{opt}$ and $V_{pps}$ as the variances of the stratified proportionate, stratified
optimum and unstratified with replacement pps estimates of the population
mean. Prove that under the proposed model (i) $E_m(V_{opt}) \le E_m(V_{pps})$,
(ii) $E_m(V_{pps}) \le E_m(V_{prop})$ if $\rho(x, x^{g-1})$ is greater than zero,
provided that (n - 1)/N is negligible relative to unity.

5.8 Estimate the finite population variance from a given stratified sample and 
derive an expression for the variance of the estimate. 

5.9 Suppose in a stratified random sampling allocation is based on estimates
$s_h$ of the $S_h$. That is, $n_h = n (N_h s_h / \sum_h N_h s_h)$. Denoting CV as the
coefficient of variation of the $s_h$ (assumed to be the same in all strata), show
that the average increase in $V(\bar{y}_{st})$ due to errors in the $s_h$ is approximately

V(ys.) - Vop, = lVN h S h ) 2 1 ] 

nN L 


5.10 A sample of n units is to be selected from a population having two strata of
sizes $N_1$ and $N_2$. How would you allocate the total sample of size n to the two
strata in order to estimate (a) the population mean $\bar{Y}$, (b) the difference
between the two strata means $\bar{Y}_1 - \bar{Y}_2$, (c) the ratio $\bar{Y}_1/\bar{Y}_2$ of the two
strata means, with the smallest possible variance of the respective estimates?

5.11 Let $n_{hy} = n (N_h S_{hy}) / \sum_h N_h S_{hy}$ and $n_{hx} = n (N_h S_{hx}) / \sum_h N_h S_{hx}$ be
optimum allocations based on the y and x characters respectively. If the
regression of y on x is linear in all strata having the same regression
coefficient and the coefficient of correlation between x and y in the hth
stratum satisfies $\rho_h \ge \rho > 0$ for h = 1, ..., L, show that the relative increase
in variance due to using $n_{hx}$ instead of $n_{hy}$ does not exceed $(1 - \rho)^2/\rho^2$.

5.12 Show that the following allocations in stratified sampling have the same
variance: (a) $n_h \propto W_h^{\alpha} S_h^{\beta}$ (b) $n_h \propto W_h^{2-\alpha} S_h^{2-\beta}$

5.13 In stratified sampling, derive allocations which simultaneously minimise
the variance of the estimate of population mean and the variance of the
variance estimator for fixed sample size.


5.14 If a population is grouped into two strata and if $\phi$ is the ratio of the
actual $n_1/n_2$ to that under Neyman allocation, show that whatever the values
of $N_1$, $N_2$, $S_1$ and $S_2$, the ratio $V_{min}(\bar{y}_{st})/V(\bar{y}_{st})$ is never less than
$4\phi/(1 + \phi)^2$.



6. SYSTEMATIC SAMPLING 


6.1 Introduction 

Systematic sampling is one of the simplest sampling procedures adopted in
practice and is operationally more convenient than simple random sampling.
Suppose the N units of the finite population are arranged in some order.
Assume N = nk, where n is the size of the sample fixed in advance. The
technique of systematic sampling consists in choosing one unit, say the rth,
at random from the first k units and then selecting every kth unit thereafter.
For example, if N = 24 and n = 4, then k = 6. Suppose the serial number of the
unit selected at random from the first 6 units is 3. Thus, the 3rd unit of the
population is selected and consequently every 6th unit thereafter, that is, the
9th, 15th and 21st. The random number r chosen from 1 to k is called the random
start and k is called the sampling interval. This procedure of selecting a
systematic sample with a random start is also equivalent to subdividing the
whole population into k clusters, each of size n, in a systematic manner and
then choosing one of the k clusters with equal probability. A systematic sample
resembles the corresponding stratified sample with one unit selected from each
stratum, where the strata are formed by the first k units, the second k units
and so on. The marked difference is that in systematic sampling only one unit
is selected at random, from the first k units forming the first stratum, and the
rest are selected by the predetermined pattern, which is not so in stratified
random sampling.
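The selection rule described above can be sketched in a few lines of Python. This is only an illustration under the stated assumption N = nk; the function name `linear_systematic_sample` and its arguments are hypothetical and do not come from the text.

```python
import random

def linear_systematic_sample(N, n, r=None):
    """Return the 1-based serial numbers of a linear systematic sample.

    Assumes N = n * k exactly. r is the random start in 1..k and is
    drawn at random when it is not supplied.
    """
    if N % n != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = N // n                        # sampling interval
    if r is None:
        r = random.randint(1, k)      # random start, both ends inclusive
    return [r + j * k for j in range(n)]

# The worked case from the text: N = 24, n = 4, random start 3.
print(linear_systematic_sample(24, 4, r=3))   # [3, 9, 15, 21]
```

Passing the random start explicitly makes the example reproducible; in practice it would be left to the random number generator.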

Apart from the simplicity of its concept and execution, systematic
sampling is likely to be more precise than simple random sampling and even
more precise than stratified sampling under certain specific conditions. Its
efficiency emerges from the fact that the systematic sample is spread more
evenly over the population. Drawing a systematic sample is less time
consuming, particularly when the drawing is made in the field. Systematic
sampling has been successfully used for sampling from stable populations where
the units are in orderly sequences, like houses in a street, files stored in
computers, books arranged on shelves, plantation trees in a forest, persons
standing in a queue before a polling booth, petrol filling station or ration
store, and in certain other natural populations (Osborne, 1942). Sukhatme,
Panse and Sastry (1958) have demonstrated the use of systematic sampling in
estimating the catch of fish in India. One of the earliest applications of
systematic sampling is its application in forest surveys illustrated by Hasel
(1942). The greatest reduction in variance in systematic sampling occurs when
there is high correlation between adjacent units in the list as regards the
measurable characteristics, the serial correlations decreasing as the interval
between units increases. Such serial correlations usually occur for phenomena
that vary in time.

In spite of the advantages inherent in the systematic sampling procedure,
there are also certain major disadvantages. The foremost among these is that
it is not possible to obtain an unbiased estimate of the sampling variance from
the sample itself, because only one random cluster is selected for the sample.
There is also a risk in using systematic sampling when there is periodicity in
the population and the interval between sample elements is equal to the period
or a multiple of it. A disturbance in the theory of systematic sampling occurs
when N is not an integral multiple of k, in which case different systematic
samples from the same finite population may vary by one unit in size. However,
this disturbance is likely to be negligible when the sample size n exceeds 50
and may be ignored for simplicity.

The theory of systematic sampling has been dealt with in detail by W.G.
Madow and L.H. Madow (1944), L.H. Madow (1946), W.G. Madow (1948, 1949,
1953), Yates (1948), Cochran (1946) and Das (1950). Critical reviews on
systematic sampling are due to Buckland (1951), Iachan (1982), Bellhouse
(1988) and Murthy and T.J. Rao (1988). Zinger (1964) has given a good
account of systematic sampling in forestry.


6.2 Selection of Systematic Samples 


Illustration 1: Let N = 20, k = 5; the units are numbered from 1 to 20. The
possible systematic samples are:

     I     II    III     IV      V
     1      2      3      4      5
     6      7      8      9     10
    11     12     13     14     15
    16     17     18     19     20

Illustration 2: N = 18, k = 5. The possible systematic samples are

     I      1      6     11     16
    II      2      7     12     17
   III      3      8     13     18
    IV      4      9     14
     V      5     10     15

In Illustration 1, the possible systematic samples are of equal size, i.e.
n = 4, but in Illustration 2, the possible systematic samples are of size either
4 or 3. However, as Cochran (1977) has observed, the effect of such disturbances
arising when N ≠ nk may be negligible if n exceeds 50 and may not be large even
if n is small.
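The two illustrations can be reproduced by listing every possible linear systematic sample for a given N and k. The short sketch below does this; the helper name `all_linear_systematic_samples` is an assumption made for illustration.

```python
def all_linear_systematic_samples(N, k):
    """List the k possible linear systematic samples (1-based serial numbers)."""
    return [list(range(r, N + 1, k)) for r in range(1, k + 1)]

for N in (20, 18):
    samples = all_linear_systematic_samples(N, 5)
    # Sample sizes are all equal when N = nk and differ by one otherwise.
    print(N, [len(s) for s in samples])
```

For N = 20 every sample has 4 units, while for N = 18 the first three samples have 4 units and the last two have 3.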

The disadvantage of systematic sampling when N ≠ nk can be overcome
by adopting a device known as circular systematic sampling, where the units are
supposed to be arranged in a circular fashion, in contrast to the usual practice
where the units are arranged in a linear fashion, that is, ordered along a line,
for which the name linear systematic sampling is used. The method of circular
systematic sampling, attributed to Lahiri (1952) and mentioned by Murthy
(1967), is to choose a random start from 1 to N and then to select the unit
corresponding to the random start and every kth unit thereafter in a cyclical
manner till a sample of n units is obtained, where k is chosen as the integer
nearest to N/n. The circular method implies that the units are arranged as
1, 2, ..., N, 1, 2, ..., that is, the units of the finite population are written
in some order and then the same order is repeated to facilitate the draw of
every kth unit till n units are selected for the sample.

Another method which may be applied to overcome difficulties in
systematic sampling when N ≠ nk is to use N/n itself as k without rounding it
off. Draw a random number R from the uniform distribution U(0, k). The hth
unit of the population is selected if $h - 1 < R + (j - 1)k \le h$ for some
j = 1, 2, ..., n. Any unit h will have inclusion probability $\pi_h = 1/k = n/N$
and every possible sample will be exactly of size n. For every h ≠ t ∈ U,
$\pi_{ht} = 1/k$ if (h, t) belong to the same sample s and zero otherwise.
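A minimal sketch of this fractional-interval selection follows. The function name `fractional_interval_sample` is hypothetical, and the clamping of the result to the range 1..N is an added guard against floating-point rounding, not part of the method as stated.

```python
import math
import random

def fractional_interval_sample(N, n, R=None):
    """Select n units using the unrounded interval k = N / n.

    R is a start in (0, k]; unit h is taken when h - 1 < R + (j - 1) k <= h.
    """
    k = N / n
    if R is None:
        R = k * (1.0 - random.random())        # uniform on (0, k]
    units = []
    for j in range(1, n + 1):
        x = R + (j - 1) * k
        h = min(N, max(1, math.ceil(x)))       # guard against rounding error
        units.append(h)
    return units

print(fractional_interval_sample(18, 4, R=1.3))   # [2, 6, 11, 15]
```

Because k is at least 1, successive selected points fall in different unit intervals, so the n units are distinct.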

6.3 Estimate of Population Mean and Its Sampling Variance 

Case I: N = nk. The composition of the k systematic samples may be written down
as follows:

                                   Sample number
                   1             2            ...   i             ...   k

                   1             2            ...   i             ...   k
Serial number      1 + k         2 + k        ...   i + k         ...   2k
of units           ...           ...                ...                 ...
written            1 + (j-1)k    2 + (j-1)k   ...   i + (j-1)k    ...   jk
row-wise           ...           ...                ...                 ...
                   1 + (n-1)k    2 + (n-1)k   ...   i + (n-1)k    ...   nk

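Denote $Y_{ij}$ as the value associated with the jth unit in the ith systematic
sample, i = 1, 2, ..., k and j = 1, 2, ..., n. Define

\[ \bar{Y}_i = \frac{1}{n} \sum_{j=1}^{n} Y_{ij} \quad \text{and} \quad \bar{Y} = \frac{1}{nk} \sum_{i=1}^{k} \sum_{j=1}^{n} Y_{ij}. \]

If the ith systematic sample is chosen, the estimate of the population mean is
given by

\[ \bar{y}_{sy} = \bar{y}_i = \frac{1}{n} \sum_{j=1}^{n} y_{ij}. \]

We find $E(\bar{y}_{sy}) = E(\bar{Y}_i) = (1/k) \sum_{i=1}^{k} \bar{Y}_i = \bar{Y}$, showing thereby that
$\bar{y}_{sy}$ is an unbiased estimate of $\bar{Y}$. Further,

\[ V(\bar{y}_{sy}) = E(\bar{y}_{sy} - \bar{Y})^2 = \frac{1}{k} \sum_{i=1}^{k} (\bar{Y}_i - \bar{Y})^2. \]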


Case II: N = nk + r, r < k. The systematic samples will be of unequal sizes,
that is, either n or n + 1. Therefore, $\bar{y}_{sy}$ would clearly be a biased estimate.
However, an unbiased estimate with the ith systematic sample is

\[ \bar{y}'_{sy} = \frac{k}{N} \sum_{j=1}^{n'} y_{ij}, \]

where n' stands for the number of units in the ith systematic sample, which may
be either n or n + 1. Indeed,

\[ E(\bar{y}'_{sy}) = \frac{1}{k} \sum_{i=1}^{k} \left( \frac{k}{N} \sum_{j=1}^{n'} Y_{ij} \right)
   = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n'} Y_{ij} = \frac{Y}{N} = \bar{Y}. \]

(It can be easily checked that each unit of the population occurs once in only
one of the k possible samples.)
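The bias of the plain sample mean and the unbiasedness of the rescaled sample total in Case II can be checked by enumerating all k samples. The sketch below uses exact fractions and a made-up population of N = 18 values with k = 5; both the population and the function name `case2_estimates` are assumptions for illustration only.

```python
from fractions import Fraction

def case2_estimates(Y, k):
    """For each of the k linear systematic samples return the plain sample
    mean and the rescaled total (k / N) * (sample total)."""
    N = len(Y)
    samples = [Y[r::k] for r in range(k)]
    means = [Fraction(sum(s), len(s)) for s in samples]
    scaled = [Fraction(k * sum(s), N) for s in samples]
    return means, scaled

Y = [i * i for i in range(1, 19)]        # hypothetical population, N = 18
means, scaled = case2_estimates(Y, 5)
pop_mean = Fraction(sum(Y), len(Y))

# Each sample has probability 1/k, so the expectation is a simple average.
print("bias of sample mean:   ", sum(means) / 5 - pop_mean)
print("bias of (k/N) * total: ", sum(scaled) / 5 - pop_mean)
```

The first bias is non-zero for this population, while the second is exactly zero, in line with the derivation above.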



6.4 Circular Systematic Sampling (CSS) 

Let k be the integer nearest to N/n. Select a random number, say r, from 1 to N.
The required sample consists of the units corresponding to the numbers

    r + jk          if r + jk ≤ N,
    r + jk - N      if r + jk > N;      j = 0, 1, 2, ..., n - 1.

Illustration 3

N = 7; n = 3.

Take k = 2. The 7 possible samples are:

Circular systematic samples of 3 units from 7 units.

Random     Sample composition       Probability     Sample mean
start      with serial numbers                      $\bar{y}_{csy}$
  1           1, 3, 5                   1/7           $(Y_1 + Y_3 + Y_5)/3$
  2           2, 4, 6                   1/7           $(Y_2 + Y_4 + Y_6)/3$
  3           3, 5, 7                   1/7           $(Y_3 + Y_5 + Y_7)/3$
  4           4, 6, 1                   1/7           $(Y_4 + Y_6 + Y_1)/3$
  5           5, 7, 2                   1/7           $(Y_5 + Y_7 + Y_2)/3$
  6           6, 1, 3                   1/7           $(Y_6 + Y_1 + Y_3)/3$
  7           7, 2, 4                   1/7           $(Y_7 + Y_2 + Y_4)/3$


It may be verified that the expected value of the CSS mean is equal to $\bar{Y}$,
showing thereby that $\bar{y}_{csy}$ is an unbiased estimate of the population mean.

In general there are N possible samples, one corresponding to each random
start r (= 1, ..., N). Each sample has the same probability of selection 1/N.
Each unit has the inclusion probability n/N of being selected into the sample
because it may come up in the sample either in the 1st position or 2nd
position, ..., or nth position.

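\[ E(\bar{y}_{csy}) = \frac{1}{N} \sum_{r=1}^{N} (\text{sample mean corresponding to the rth random start})
   = \frac{1}{N} \sum_{r=1}^{N} \bar{y}_r = \frac{1}{Nn} \sum_{r=1}^{N} \sum_{j=0}^{n-1} y_{r+jk}
   = \frac{1}{Nn} \, n \sum_{i=1}^{N} Y_i = \bar{Y}, \]

\[ V(\bar{y}_{csy}) = \frac{1}{N} \sum_{r=1}^{N} (\bar{y}_r - \bar{Y})^2. \]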
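The circular selection rule and the unbiasedness of the CSS mean can be checked by enumerating all N random starts. The sketch below reproduces Illustration 3 (N = 7, n = 3, k = 2) on a made-up set of Y values; the function name `circular_systematic_sample` and the tie-breaking rule used when N/n is exactly halfway between two integers are assumptions.

```python
import math
from fractions import Fraction

def circular_systematic_sample(N, n, r, k=None):
    """1-based serial numbers of the circular systematic sample with start r."""
    if k is None:
        k = math.floor(N / n + 0.5)     # integer nearest to N/n (ties go up)
    return [((r - 1 + j * k) % N) + 1 for j in range(n)]

N, n = 7, 3
Y = [12, 7, 9, 15, 4, 10, 13]           # hypothetical values Y_1, ..., Y_7
pop_mean = Fraction(sum(Y), N)

samples = [circular_systematic_sample(N, n, r) for r in range(1, N + 1)]
means = [Fraction(sum(Y[u - 1] for u in s), n) for s in samples]

print(samples[3])                        # [4, 6, 1], as in the table
print(sum(means) / N == pop_mean)        # True: the CSS mean is unbiased
print(sum((m - pop_mean) ** 2 for m in means) / N)   # V of the CSS mean
```

Every unit appears in exactly n of the N samples here, which is the inclusion probability n/N stated in the text.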



6.5 Comparison of Systematic Sampling with Simple Random 
Sampling (N = nk) 

Write

\[ (N - 1) S^2 = \sum_{i=1}^{k} \sum_{j=1}^{n} (Y_{ij} - \bar{Y})^2
   = n \sum_{i=1}^{k} (\bar{Y}_i - \bar{Y})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_i)^2
   = nk \, V(\bar{y}_{sy}) + k(n - 1) S^2_{wsy}, \]

where the coefficient k(n - 1) stands for the degrees of freedom of
$\sum \sum (Y_{ij} - \bar{Y}_i)^2$ in the break up of the sum of squares in the analysis of
variance.

Theorem 6.1: $\bar{y}_{sy}$ is more precise than the simple random sampling mean $\bar{y}$ if
$S^2_{wsy} > S^2$.

Proof. For SRSWOR, $V(\bar{y}) = [(N - n)/Nn] S^2$. By simple algebra it may be
seen that $V(\bar{y}_{sy}) < V(\bar{y})$ if $S^2_{wsy} > S^2$.

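Theorem 6.2

\[ V(\bar{y}_{sy}) = \frac{S^2}{n} \left( \frac{N - 1}{N} \right) [1 + (n - 1) \rho_w], \]

where $\rho_w$ is the correlation coefficient between pairs of units in the same
systematic sample.

Proof: Now,

\[ V(\bar{y}_{sy}) = \frac{1}{k} \sum_{i} (\bar{Y}_i - \bar{Y})^2
   = \frac{1}{k} \sum_{i} \left[ \frac{1}{n} \sum_{j} (Y_{ij} - \bar{Y}) \right]^2 \]

\[ = \frac{1}{kn^2} \left[ \sum_{i} \sum_{j} (Y_{ij} - \bar{Y})^2
   + \sum_{i} \sum_{j \ne u} (Y_{ij} - \bar{Y})(Y_{iu} - \bar{Y}) \right] \]

\[ = \frac{1}{kn^2} \left[ (N - 1) S^2 + (N - 1)(n - 1) \rho_w S^2 \right] \]

\[ = \frac{S^2}{n} \left( \frac{N - 1}{N} \right) [1 + (n - 1) \rho_w], \]

where

\[ \rho_w = \frac{E (Y_{ij} - \bar{Y})(Y_{iu} - \bar{Y})}{E (Y_{ij} - \bar{Y})^2}
   = \frac{\sum_{i} \sum_{j \ne u} (Y_{ij} - \bar{Y})(Y_{iu} - \bar{Y}) / [kn(n - 1)]}{\sum_{i} \sum_{j} (Y_{ij} - \bar{Y})^2 / (kn)}
   = \frac{\sum_{i} \sum_{j \ne u} (Y_{ij} - \bar{Y})(Y_{iu} - \bar{Y})}{(N - 1)(n - 1) S^2}. \]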


Remarks: The relative efficiency of systematic sampling compared to simple
random sampling is dependent on the intra-class correlation coefficient $\rho_w$, as

\[ R.E.(\bar{y}_{sy}) = \frac{V(\bar{y})}{V(\bar{y}_{sy})} = \frac{n(k - 1)}{(nk - 1)[1 + (n - 1) \rho_w]}. \]

Therefore, if $\rho_w = -1/(nk - 1)$, systematic sampling and SRSWOR are
equally efficient. In case $\rho_w < -1/(nk - 1)$, systematic sampling is more
precise than SRSWOR. The minimum possible value of $\rho_w$ is $-1/(n - 1)$, in
which case $V(\bar{y}_{sy})$ is zero. If $\rho_w = 1$, R.E. equals (k - 1)/(nk - 1).
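The analysis-of-variance break up of Section 6.5, Theorem 6.2 and the relative efficiency formula are algebraic identities, so they can be checked exactly on any small population. The sketch below uses a made-up population of N = 12 units arranged as k = 4 systematic samples of size n = 3; the data and variable names are assumptions for illustration.

```python
from fractions import Fraction

# Hypothetical population: row i holds the n units of the ith systematic sample.
samples = [[1, 5, 6], [4, 7, 12], [2, 3, 10], [8, 9, 11]]
k, n = len(samples), len(samples[0])
N = n * k

Ybar = Fraction(sum(map(sum, samples)), N)
S2 = sum((y - Ybar) ** 2 for s in samples for y in s) / (N - 1)

# Direct variance of the systematic sample mean over the k possible samples.
V_sy = sum((Fraction(sum(s), n) - Ybar) ** 2 for s in samples) / k

# Pooled within-sample mean square with k(n - 1) degrees of freedom.
S2_wsy = sum((y - Fraction(sum(s), n)) ** 2 for s in samples for y in s) / (k * (n - 1))

# Intraclass correlation: sum over ordered pairs j != u within each sample.
cross = sum((s[j] - Ybar) * (s[u] - Ybar)
            for s in samples for j in range(n) for u in range(n) if j != u)
rho_w = cross / ((N - 1) * (n - 1) * S2)

V_formula = (S2 / n) * Fraction(N - 1, N) * (1 + (n - 1) * rho_w)
V_ran = Fraction(N - n, N * n) * S2
RE = Fraction(n * (k - 1), n * k - 1) / (1 + (n - 1) * rho_w)

print((N - 1) * S2 == n * k * V_sy + k * (n - 1) * S2_wsy)   # ANOVA break up
print(V_formula == V_sy)                                     # Theorem 6.2
print(RE == V_ran / V_sy, float(RE))                         # R.E. formula
```

Exact fractions are used so that the identities hold without rounding error.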
6.6 Comparison of Systematic Sampling with Stratified
Random Sampling

Suppose the population is divided into n strata formed as the first k units, the
second k units and so on, and one unit at random is selected from each stratum.
Then

\[ V(\bar{y}_{st}) = \frac{1}{kn^2} \sum_{i} \sum_{j} (Y_{ij} - \bar{Y}_{.j})^2
   = \frac{1}{n} \left( 1 - \frac{1}{k} \right) S^2_{wst}
   = \left( \frac{N - n}{Nn} \right) S^2_{wst}, \]

where $S^2_{wst} = \sum_{i} \sum_{j} (Y_{ij} - \bar{Y}_{.j})^2 / [n(k - 1)]$ is the pooled mean square
between units within rows (strata). Defining

\[ \rho_{wst} = \frac{E (Y_{ij} - \bar{Y}_{.j})(Y_{iu} - \bar{Y}_{.u})}{E (Y_{ij} - \bar{Y}_{.j})^2}
   = \frac{2}{n(n - 1)(k - 1)} \sum_{i} \sum_{j < u} \frac{(Y_{ij} - \bar{Y}_{.j})(Y_{iu} - \bar{Y}_{.u})}{S^2_{wst}}, \]

it can be shown that $V(\bar{y}_{sy}) = (S^2_{wst}/n) [(N - n)/N] [1 + (n - 1) \rho_{wst}]$.

Thus, systematic sampling has the same precision as that of the
corresponding stratified random sampling with one unit per stratum if
$\rho_{wst} = 0$.
More details about the precision of systematic sampling are studied by W.G. 
Madow and L.H. Madow (1944). 


6.7 Performance of Systematic Sampling for Some Specific Populations 
(a) Population in random order 

If the units of the population are arranged serially at random (for example, in
sampling from human populations, if the names are arranged alphabetically by
surname and the character measured has no relation to the surname of the
individual), one would intuitively expect systematic sampling to be equivalent
to simple random sampling with the same variance, although for a single finite
population such equivalence may not be exactly true. In this connection two
important results due to Madow and Madow (1944) are as follows:

1. Let the N! permutations of $Y_1, ..., Y_N$ form N! finite populations. Averaging
the variance over all these finite populations, $E(V_{sy}) = V_{ran}$.

2. Assume the given finite population to be a random sample from a super
population. Thus, $Y_1, ..., Y_N$ are treated as random variables, with
$E_m(Y_i) = \mu$, $E_m(Y_i - \mu)(Y_j - \mu) = 0$ (i ≠ j) and $E_m(Y_i - \mu)^2 = \sigma_i^2$.

Then $E_m(V_{sy}) = E_m(V_{ran}) = E_m(V_{st}) = [(N - n)/Nn] \sum_i \sigma_i^2 / N$, where $E_m$
stands for averaging over all finite populations that can be drawn from the
assumed super population.

(b) Population with Linear Trend 

Suppose $Y_i = a + bi$ (i = 1, ..., N), denoting that there is a linear trend in
the population. By simple algebra it can be shown that

    $V_{ran} = [(k - 1)(N + 1)/12] \, b^2$
    $V_{st}  = [(k^2 - 1)/12n] \, b^2$
    $V_{sy}  = [(k^2 - 1)/12] \, b^2$

Thus, $V_{st} \le V_{sy} \le V_{ran}$, that is, stratified random sampling is the most
efficient one when compared with systematic sampling and srs in the presence
of linear trend, and its efficiency increases with increase in sample size.
However, the efficiency of systematic sampling in comparison to stratified
random sampling can be improved by applying the end corrections proposed
by Yates (1948), when N = nk. Bellhouse and Rao (1975) have applied end
corrections in the presence of linear trend for the case of circular systematic
sampling when N ≠ nk.
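The three variance expressions for a linear trend can be confirmed by brute force. The following sketch, with hypothetical function and argument names, builds the population $Y_i = a + bi$, computes each variance from its definition in exact arithmetic and compares it with the closed form. The stratified design is taken to be one unit per stratum of k consecutive units, as in Section 6.6.

```python
from fractions import Fraction

def linear_trend_variances(n, k, a=0, b=1):
    """Return (V_ran, V_st, V_sy) computed directly for Y_i = a + b*i."""
    N = n * k
    Y = [Fraction(a) + Fraction(b) * i for i in range(1, N + 1)]
    Ybar = sum(Y) / N
    S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)
    V_ran = Fraction(N - n, N * n) * S2
    # Systematic sampling: variance of the k possible sample means.
    V_sy = sum((sum(Y[r::k]) / n - Ybar) ** 2 for r in range(k)) / k
    # Stratified sampling: one unit from each block of k consecutive units.
    V_st = Fraction(0)
    for j in range(n):
        stratum = Y[j * k:(j + 1) * k]
        m = sum(stratum) / k
        S2_j = sum((y - m) ** 2 for y in stratum) / (k - 1)
        V_st += (1 - Fraction(1, k)) * S2_j
    return V_ran, V_st / (n * n), V_sy

n, k, b = 4, 6, 2
V_ran, V_st, V_sy = linear_trend_variances(n, k, a=5, b=b)
N = n * k
print(V_ran == Fraction((k - 1) * (N + 1), 12) * b ** 2)
print(V_st == Fraction(k * k - 1, 12 * n) * b ** 2)
print(V_sy == Fraction(k * k - 1, 12) * b ** 2)
```

All three comparisons print True, and the ordering $V_{st} \le V_{sy} \le V_{ran}$ can be read off the returned values.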

An alternative method to remove the effect of linear trend is to select the 
sample in such a way that the sample mean is unaffected by the linear trend. 
This may be achieved by the following three methods of sample selection, 
when N = nk, These are: 

i. Method of centered systematic sampling by Madow (1953) 

ii. Method of balanced systematic sampling considered by Sethi (1965). 

iii. Modified systematic sampling method of Singh, Jindal and Garg (1968). 

Performances of these methods have been studied by Bellhouse and Rao 
(1975) under superpopulation model representing linear, parabolic and periodic 
trends and autocorrelated variations. 

(c) Population with Periodic Variation 

If the population follows a periodic trend, for example a sine curve, the
performance of the systematic sample depends on the value of k. In case the
sample points are such that k is equal to the period of the sine curve or an
integral multiple of the period, the systematic sample provides a poor result
in the sense that every observation of the sample is the same, being a repeated
one, and hence its efficiency is no more than that of a single observation. The
most favourable sample points occur when k is an odd multiple of the half
period, making every systematic sample mean exactly equal to the population
mean. The sampling variance of the systematic sample mean in this situation
becomes zero. Between the two extreme situations mentioned above the systematic
sample has various degrees of effectiveness.
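The two extreme cases can be seen numerically on a sine curve. The sketch below assumes a hypothetical population of N = 40 points on a sine wave of period 10, and compares k = 10 (the period) with k = 5 (half the period, with an even sample size n = 8); the function name is an assumption.

```python
import math

def systematic_mean_variance(Y, k):
    """Variance of the systematic sample mean over the k possible starts."""
    N = len(Y)
    Ybar = sum(Y) / N
    means = [sum(Y[r::k]) / len(Y[r::k]) for r in range(k)]
    return sum((m - Ybar) ** 2 for m in means) / k

period = 10
Y = [math.sin(2 * math.pi * i / period) for i in range(1, 41)]   # N = 40

worst = systematic_mean_variance(Y, 10)   # k equal to the period
best = systematic_mean_variance(Y, 5)     # k equal to half the period

print(round(worst, 6))   # about 0.5, the variance of a single observation
print(best < 1e-12)      # True: zero apart from floating-point error
```

With k equal to the period each sample repeats one value four times, so the variance of the mean is the same as that of one observation.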

(d) Auto-correlated Populations 

It is generally observed that in many natural populations two observations $y_i$
and $y_j$ are found to be more alike when they are close together than when they
are far apart. Assuming that $y_i$ and $y_j$ are positively correlated, we may
postulate a mathematical model where the correlation is a function of their
distance |i - j| and diminishes with the increase in the distance. Populations
having such characteristics are called autocorrelated populations. As the
correlogram, the curve depicting the relationship between the serial
correlation coefficient $\rho_u$ and the number of units separating a pair of units
(u), may not be a smooth function because of the finite nature of the
population, the comparison of systematic sampling with its competitors,
stratified random sampling and simple random sampling, can be made over a whole
series of finite populations which are drawn from an infinite super population
following the model postulated above. Thus we assume that $Y_1, Y_2, ..., Y_N$ are
drawn from a super population having $E_m(Y_i) = \mu$, $E_m(Y_i - \mu)^2 = \sigma^2$,
$E_m(Y_i - \mu)(Y_{i+u} - \mu) = \rho_u \sigma^2$, where $\rho_u \ge \rho_v \ge 0$ whenever u < v.
The average variance under the model is $E_m E_p (\bar{y}_{sy} - \bar{Y})^2$. It is easily
seen that under the model stratified random sampling is more efficient than
simple random sampling to estimate the population mean, but no general
conclusion can be established about systematic sampling.

Cochran (1946) has proved that if, in addition to the conditions stated in the
above model,

    $\delta_u^2 = \rho_{u+1} + \rho_{u-1} - 2\rho_u \ge 0$;  u = 2, 3, ..., kn - 2, then

    $E_m V_{sy} \le E_m V_{st} \le E_m V_{ran}$ for any size of sample.

Quenouille (1949) has shown that Cochran's (1946) results remain valid
if $E_m(Y_i) = \mu_i$ and $E_m(Y_i - \mu_i)^2 = \sigma_i^2$.

6.8 Estimation of Sampling Variance 

Systematic sampling has a major drawback in the sense that no unbiased estimate
of the sampling variance can be obtained from the sample itself, as the
systematic sample is a simple random sample of one cluster only. In such
situations it is common practice to regard the systematic sample as a random
sample and estimate the sampling variance using the simple random sample
formula. When the assumption of a random sample is far from reality, the
estimator may be badly biased. Several alternative approximate estimators of
the variance of the systematic sample mean have been presented in the
statistical literature, which should be used in practice with caution, keeping
in view the nature and ordering of the target population (Wolter, 1985).

Regarding the systematic sample as if it were a simple random sample of n
units, the estimate of variance may be given by

\[ v_1 = \left( \frac{1}{n} - \frac{1}{nk} \right) s^2, \]

where $s^2 = \frac{1}{n - 1} \sum_{j=1}^{n} (y_{ij} - \bar{y}_i)^2$.

Now,

\[ E(v_1) = \left( \frac{1}{n} - \frac{1}{nk} \right) \left( \frac{N - 1}{N} \right) S^2 (1 - \rho_w) \]

does not equal $V(\bar{y}_{sy})$ unless $\rho_w = -1/(nk - 1)$, in which case the systematic
sample behaves like a simple random sample of the population.

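Another approximation is obtained by regarding the systematic sample as
a stratified random sample with 2 units selected from each successive stratum
consisting of the first 2k units, the second 2k units and so on. Thus we have an
estimator

\[ v_2 = \left( 1 - \frac{n}{N} \right) \frac{1}{n} \sum_{j=1}^{n/2} \frac{d_{i,2j}^2}{n}, \]

where $d_{ij} = y_{ij} - y_{i,j-1}$.

Another estimator of the above type is

\[ v_3 = \left( 1 - \frac{n}{N} \right) \frac{1}{n} \sum_{j=2}^{n} \frac{d_{ij}^2}{2(n - 1)}. \]

Yates (1948) has suggested a number of approximate estimators based on
higher order contrasts than those present in $v_2$ and $v_3$. Some of these are

\[ v_4 = \left( 1 - \frac{n}{N} \right) \frac{1}{n} \sum_{j=3}^{n} \frac{b_{ij}^2}{6(n - 2)}, \]

where $b_{ij} = \Delta d_{ij} = \Delta^2 y_{ij} = y_{ij} - 2y_{i,j-1} + y_{i,j-2}$, and

\[ v_5 = \left( 1 - \frac{n}{N} \right) \frac{1}{n} \sum_{j=5}^{n} \frac{c_{ij}^2}{3.5(n - 4)}, \]

where $c_{ij} = \frac{1}{2} \Delta^4 y_{ij} + \Delta^2 y_{i,j-1}
= y_{ij}/2 - y_{i,j-1} + y_{i,j-2} - y_{i,j-3} + y_{i,j-4}/2$, and so on.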
A general class of estimators which looks like the random group estimator is
formed by splitting the parent sample into equal sized systematic sub-samples.
For example, if p and n/p are integers, we may compute $\bar{y}_\alpha$ as the sample mean
of the αth systematic sub-sample of size n/p, written as

\[ \bar{y}_\alpha = \frac{p}{n} \sum_{j=1}^{n/p} y_{i,\,p(j-1)+\alpha}. \]

Thus, the variance estimator is

\[ v_6(\bar{y}_{sy}) = \left( 1 - \frac{n}{N} \right) \frac{1}{p(p - 1)} \sum_{\alpha=1}^{p} (\bar{y}_\alpha - \bar{y}_{sy})^2. \]

Koop (1971) has investigated the bias of this estimator for p = 2 in terms of
the intraclass correlation coefficient.

Cochran (1964) has suggested a variance estimator taking into account the
estimate of the intraclass correlation coefficient $\rho_w$, given by

    $v_7 = (1 - n/N)(s^2/n)[1 + 2/\log \hat{\rho}_w + 2/(\hat{\rho}_w^{-1} - 1)]$,  if $\hat{\rho}_w > 0$,

        $= (1 - n/N) \, s^2/n$,  if $\hat{\rho}_w \le 0$,

where

\[ \hat{\rho}_w = \frac{\sum_{j=2}^{n} (y_{ij} - \bar{y}_{sy})(y_{i,j-1} - \bar{y}_{sy})}{(n - 1) s^2}. \]

Zinger (1980) considers a weighted estimator of the population mean by
taking a linear combination of the estimator based on a systematic sample of
size n and another srs estimator based on a supplementary sample of size n'
drawn without replacement from the remaining N - n units, given by the weighted
estimator

\[ \bar{y}_{w(sy)} = (1 - \alpha) \bar{y}_{sy} + \alpha \bar{y}_{srs}, \quad 0 < \alpha < 1. \]

Zinger (1980) and Wu (1981) have discussed unbiased estimators of the
variance of $\bar{y}_{w(sy)}$.

Instead of choosing a systematic sample with a single random start one
may consider p systematic samples generated by p random starts corresponding
to p random integers which are selected at random between 1 and k. Such
systematic sampling is called multiple start systematic sampling. Assume N =
nk. Now, corresponding to the p systematic samples of size n, denote $\bar{y}_\beta$
(β = 1, 2, ..., p) as the p systematic sample means. The combined estimator of
the population mean is

\[ \bar{y}^*_{sy} = \frac{1}{p} \sum_{\beta=1}^{p} \bar{y}_\beta. \]

The unbiased variance estimator is given by

\[ v(\bar{y}^*_{sy})_{wr} = \frac{1}{p(p - 1)} \sum_{\beta=1}^{p} (\bar{y}_\beta - \bar{y}^*_{sy})^2 \]

if the p random starts are selected with replacement, and is given by

\[ v(\bar{y}^*_{sy})_{wor} = \left( 1 - \frac{np}{N} \right) \frac{1}{p(p - 1)} \sum_{\beta=1}^{p} (\bar{y}_\beta - \bar{y}^*_{sy})^2 \]

if the p random starts are selected without replacement. Here np/N = p/k is the
fraction of the k possible systematic samples that is actually drawn.
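A small sketch of the multiple start estimator is given below, together with an exhaustive check of unbiasedness on a made-up population. The function name `multi_start_estimate`, the data, and the use of the finite population correction 1 - p/k for starts drawn without replacement are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import combinations, product

def multi_start_estimate(Y, n, starts, with_replacement):
    """Combined mean and variance estimate from p systematic samples.

    starts holds the p random starts (1-based, each between 1 and k).
    """
    k = len(Y) // n                      # assumes N = n * k
    p = len(starts)
    means = [Fraction(sum(Y[r - 1::k]), n) for r in starts]
    combined = sum(means) / p
    v = sum((m - combined) ** 2 for m in means) / (p * (p - 1))
    if not with_replacement:
        v *= 1 - Fraction(p, k)          # fpc for p of the k possible samples
    return combined, v

Y = [1, 4, 2, 8, 5, 7, 3, 9, 6, 12, 10, 11]   # hypothetical, N = 12, n = 3, k = 4
mu = Fraction(sum(Y), len(Y))

for wr, designs in ((False, list(combinations(range(1, 5), 2))),
                    (True, list(product(range(1, 5), repeat=2)))):
    results = [multi_start_estimate(Y, 3, s, wr) for s in designs]
    true_var = sum((e - mu) ** 2 for e, _ in results) / len(results)
    mean_v = sum(v for _, v in results) / len(results)
    print(wr, mean_v == true_var)        # True in both cases
```

Averaging the variance estimate over every equally likely set of starts reproduces the true variance of the combined mean, which is what unbiasedness means here.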

Wolter (1984, 1985) has made extensive theoretical and Monte Carlo studies
on the biases and efficiencies of different variance estimators using five
mathematical models: the random model, linear trend model, periodic population
model, stratification effects model and autocorrelated model.


6.9 Systematic Sampling with Unequal Probability of Selection 


Arrange the N units of the population in a list. Let $Y_i$ and $X_i$ denote the
values of the study and auxiliary variables respectively for the ith unit.
Define $M_i = \sum_{j=1}^{i} X_j$ as the cumulative measure of size. A systematic sample
of size n with selection interval $I = \sum_{i=1}^{N} X_i / n = X/n$ is selected as follows:
Select a random number R greater than 0 and less than or equal to I. Then the n
selected numbers are R, R + I, R + 2I, ..., R + (n - 1)I, which identify the
population units to come into the sample. The ith unit in the list is selected
if it is the first unit for which the cumulative size $M_i$ is greater than or
equal to the selected number. The probability of including the ith unit in the
sample is equal to $\pi_i = X_i/I = nX_i/X = np_i$, which shows that systematic PPS
sampling is a πPS sampling scheme.
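The cumulative-size selection rule can be sketched as follows. The function name `pps_systematic_sample`, the example sizes, and the explicit check that no size exceeds the interval I (needed for $\pi_i = X_i/I$ not to exceed 1) are assumptions added for this illustration.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_systematic_sample(sizes, n, R=None):
    """1-based indices of a systematic PPS sample of n units.

    sizes holds X_1, ..., X_N; R is the random start in (0, I], I = X / n.
    """
    interval = sum(sizes) / n
    if max(sizes) > interval:
        raise ValueError("this sketch assumes every X_i <= I")
    if R is None:
        R = interval * (1.0 - random.random())     # uniform on (0, I]
    cum = list(accumulate(sizes))                  # cumulative sizes M_i
    picks = []
    for j in range(n):
        target = R + j * interval
        # first unit whose cumulative size is >= the selected number
        idx = min(bisect_left(cum, target), len(sizes) - 1)
        picks.append(idx + 1)
    return picks

sizes = [2, 3, 1, 4, 2]          # hypothetical sizes, X = 12, n = 3, I = 4
for R in (1, 2, 3, 4):
    print(R, pps_systematic_sample(sizes, 3, R))
```

With integer starts 1 to 4 each equally likely, unit i appears in $X_i$ of the 4 samples, which matches $\pi_i = X_i/I$.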

Approximate variance estimators can be obtained by using Hartley and
Rao's (1962) approximations to $\pi_{ij}$ given by

\[ \pi_{ij} = \frac{n - 1}{n} \pi_i \pi_j + \frac{n - 1}{n^2} \pi_i \pi_j (\pi_i + \pi_j)
   - \frac{n - 1}{n^3} \pi_i \pi_j \sum_{t=1}^{N} \pi_t^2, \]

which are correct up to $O(N^{-3})$ and are calculated with the assumptions that
the listing of units is at random and $\pi_i$ is of order $O(N^{-1})$. Using the Yates
and Grundy form, the corresponding approximate variance estimator of the
population total is given by

\[ v^{(1)}_{PPS(sy)} = \frac{1}{n - 1} \sum_{i} \sum_{j < i}
   \left[ 1 - (\pi_i + \pi_j) + \frac{1}{n} \sum_{t=1}^{N} \pi_t^2 \right]
   \left( \frac{y_i}{\pi_i} - \frac{y_j}{\pi_j} \right)^2. \]
Treating the sample as if it were a PPS WR sample, we have another estimator of
the sampling variance of the estimate of population total, given by

\[ v^{(2)}_{PPS(sy)} = \frac{1}{n(n - 1)} \sum_{i=1}^{n} \left( \frac{y_i}{p_i} - \hat{Y} \right)^2. \]

Other variance estimators are:

\[ v^{(3)}_{PPS(sy)} = \frac{1}{n^2} \sum_{i=1}^{n/2} \left( \frac{y_{2i}}{p_{2i}} - \frac{y_{2i-1}}{p_{2i-1}} \right)^2, \]

treating the sample as if two units are selected from each of n/2 equal sized
strata,

\[ v^{(4)}_{PPS(sy)} = \frac{1}{2n(n - 1)} \sum_{i=2}^{n} \left( \frac{y_i}{p_i} - \frac{y_{i-1}}{p_{i-1}} \right)^2, \]

which utilizes overlapping differences unlike $v^{(3)}_{PPS(sy)}$, and

\[ v^{(5)}_{PPS(sy)} = \frac{1}{p(p - 1)} \sum_{\alpha=1}^{p} (\hat{Y}_\alpha - \hat{Y})^2, \]

where $\hat{Y}_\alpha$ is the HT estimate of total based on the αth systematic sub-sample
(α = 1, ..., p).


NUMERICAL EXAMPLES 

Example 6.1: Given in Table 6E.1 are data on 6 systematic samples of size 4
from a population of 24 units.

Compute the relative efficiency of systematic sampling over unrestricted
random sampling to estimate the population total.

Table 6E.1

               Systematic sample numbers
      1      2      3      4      5      6
      2      1      3      2      4      3
      2      4      5      1      3      1
      1      3      2      5      2      4
      3      1      1      0      4      2

Solution. N = 24, k = 6, n = 4. Let $y_{ij}$ be the value of the jth unit in the ith
systematic sample (i = 1, ..., 6). The systematic sample totals are

    $y_{1.} = 8$, $y_{2.} = 9$, $y_{3.} = 11$, $y_{4.} = 8$, $y_{5.} = 13$ and $y_{6.} = 10$.

The estimate of the population total Y by systematic sampling is given by

    $\hat{Y}_{sy} = N \bar{y}_{sy} = N \bar{y}_i$,

    $V(\hat{Y}_{sy}) = N^2 V(\bar{y}_{sy})$.

Now,

    $V(\bar{y}_{sy}) = \frac{1}{k} \sum_i (\bar{y}_i - \bar{Y})^2$

               $= (1/n^2 k) [\sum_i y_{i.}^2 - Y^2/k]$

               $= (1/96) [599 - 59^2/6] = 0.1961$

Therefore, $V(\hat{Y}_{sy}) = N^2 V(\bar{y}_{sy}) = 24^2 (0.1961) = 112.95$.

In case of simple random sampling the variance of the estimate of
population total is

    $V(N\bar{y}) = N^2 (1/n - 1/N) S^2$

            $= [N(N - n)/n(N - 1)] [\sum \sum y_{ij}^2 - Y^2/N]$

            $= (5.2174)(43.96) = 229.36$

Relative efficiency of systematic sampling over simple random sampling is
R.E. = (229.36)/(112.95) x 100 = 203%.
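The arithmetic of Example 6.1 can be reproduced with a few lines of Python. The sketch below takes the columns of Table 6E.1 as the six systematic samples; the variable names are assumptions. Small differences from the printed figures (for instance 113.0 against 112.95) come from rounding of intermediate values in the hand calculation.

```python
# Each inner list is one systematic sample (one column of Table 6E.1).
samples = [
    [2, 2, 1, 3],
    [1, 4, 3, 1],
    [3, 5, 2, 1],
    [2, 1, 5, 0],
    [4, 3, 2, 4],
    [3, 1, 4, 2],
]
k, n = len(samples), len(samples[0])
N = n * k

totals = [sum(s) for s in samples]                 # 8, 9, 11, 8, 13, 10
Y = sum(totals)                                    # 59

# Variance of the systematic sample mean and of the estimated total.
V_mean_sy = (sum(t * t for t in totals) - Y * Y / k) / (n * n * k)
V_total_sy = N * N * V_mean_sy

# Variance of the estimated total under simple random sampling (SRSWOR).
ss = sum(y * y for s in samples for y in s) - Y * Y / N
V_total_ran = N * (N - n) / (n * (N - 1)) * ss

RE = 100 * V_total_ran / V_total_sy
print(totals, round(V_mean_sy, 4), round(V_total_sy, 2))
print(round(V_total_ran, 2), round(RE))
```

The relative efficiency comes out at about 203 per cent, in agreement with the solution above.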


Example 6.2: Given below is a systematic sample of size 15 with a sampling
interval of 14 days, selected from 210 daily milk yield records of the first
lactation of a cow belonging to the Tharparkar herd maintained at a cattle farm
in Bihar (India). The milk yield of the first five days was not recorded, being
for the colostrum period.

Give an estimate of the total milk yield for the period under study and suggest
an estimate of its standard error. Daily milk yields in litres for the sample
days: 9, 16, 12, 14, 23, 17, 20, 21, 16, 16, 17, 19, 15, 8, 4.

Solution: N = 210, n = 15 and k = 14.

An unbiased estimate of the total milk yield is

    $\hat{Y} = \frac{N}{n} \sum_{i=1}^{n} y_i = (210/15)(227) = 3178$ litres.

An approximate estimate of variance is given by

    $v(\hat{Y}) = \frac{N^2 (k - 1)}{2nk(n - 1)} \sum_{i=1}^{n-1} (y_{i+1} - y_i)^2$

            $= \frac{210^2 (14 - 1)}{2 \times 15 \times 14 \times (15 - 1)} \times 307$

            $= 29,932.5$ (litre)$^2$

    Est. S.E. $(\hat{Y}) = \sqrt{29,932.5} = 173.01$ litres.
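The calculation in Example 6.2 uses the successive difference estimator, and can be checked as follows. This is a direct transcription of the arithmetic in the solution; the variable names are assumptions.

```python
import math

y = [9, 16, 12, 14, 23, 17, 20, 21, 16, 16, 17, 19, 15, 8, 4]
N, n = 210, len(y)
k = N // n                                     # sampling interval, 14

total_hat = N / n * sum(y)                     # estimated total yield

# Sum of squared successive differences (y_{i+1} - y_i)^2.
ssd = sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1))

var_hat = N ** 2 * (k - 1) / (2 * n * k * (n - 1)) * ssd
se_hat = math.sqrt(var_hat)

print(total_hat, ssd, var_hat, round(se_hat, 2))
```

The script gives 3178 litres for the total, 307 for the sum of squared differences, 29932.5 for the variance estimate and 173.01 litres for the standard error.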


EXERCISES 

6.1 Given below are the data for 10 systematic samples of size 4 from a
population of 40 units.

                      Systematic sample numbers
     1     2     3     4     5     6     7     8     9    10
     0     1     2     1     4     5     2     4     7     8
     7     8     9    12    10    13    15     6     8    30
    15    17    21    20    21    24    16    15    28    30
    25    29     6    29    11    32     8    12    23    35

i) Compute the relative efficiency of systematic sampling over random
sampling to estimate the population mean.

ii) Draw five circular systematic samples of size 10 from the above
population and compute the combined estimate of the population mean from
the individual sample estimates. Calculate the estimate of standard error
of the combined estimate.

6.2 From a region consisting of 300 large farms, a systematic sample of 30
farms was selected to estimate the area under mixed cropping (Table 6.1).

i) Estimate the total area under mixed cropping in the region.

ii) Estimate the standard error of the estimate using the method of
successive differences and compare it with the standard error obtained on the
assumption of SRSWOR.

Table 6.1 Data on area (y) under mixed cropping for 30 sampled farms.

    Farm      y        Farm      y        Farm      y
      1      0.1        11      1.2        21      1.5
      2      0.5        12      1.6        22      1.0
      3      0.8        13      2.3        23      2.4
      4      1.2        14      1.8        24      2.6
      5      0.7        15      2.4        25      1.8
      6      1.0        16      0.9        26      0.9
      7      0.8        17      0.3        27      2.4
      8      0.2        18      1.2        28      4.2
      9      0.6        19      0.6        29      2.6
     10      0.9        20      0.8        30      1.6

(y) in hectares: 1 hectare = 2.471 acres

6.3 Under the super population model

i) $E_m(Y_i) = \mu$, $V(Y_i) = \sigma_i^2$, $cov_m(Y_i, Y_{i'}) = \sigma_{ii'} \ne 0$,
ii) $E_m(Y_i) = a + bi$, $V(Y_i) = \sigma^2$, $cov(Y_i, Y_{i'}) = 0$ (i ≠ i'),

derive the expressions for the expected variances of the sample mean
based on samples of size n selected (a) systematically (b) with SRSWOR,
assuming N = nk.

6.4 Compare the efficiencies of SRSWOR, stratified random sampling and
systematic sampling to estimate the population mean when there is a
quadratic trend in the population, that is, $Y_i = a + bi + ci^2$,
for the cases (i) a = b = c = 1,
(ii) a = b = 0.

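6.5 Show that the variance of $\bar{y}_{sy}$, the estimator of the population mean, can be
expressed as

\[ V(\bar{y}_{sy}) = \frac{nk - 1}{nk} \, \frac{S^2}{n} \left[ 1 + \frac{2}{n} \sum_{\alpha=1}^{n-1} (n - \alpha) \rho_\alpha \right], \]

where

\[ \rho_\alpha = \frac{E (y_{ij} - \bar{y}_{..})(y_{i,j+\alpha} - \bar{y}_{..})}{E (y_{ij} - \bar{y}_{..})^2} \]

is the non-circular serial correlation coefficient for lag kα. (Madow, 1944).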

6.6 In stratified sampling, if systematic samples are selected from the strata,
derive expressions for the variance of the estimate of the population mean
under proportional and optimum allocation. Suggest a suitable method to
estimate the gain in precision from a given stratified systematic sample.

6.7 Consider the following three methods of constructing a systematic sampling
design.

(a) Method of multiple distance for a linearly ordered population. Choose
a random starting unit r between 1 and N and then move systematically in
the clockwise direction at steps of $k_1, k_2, ..., k_{n-1}$, where the $k_i$ are a set
of fixed positive integers. The sample consists of the distinct units having
serial numbers $r, r + k_1, r + k_1 + k_2, ..., r + (k_1 + k_2 + ... + k_{n-1})$, reduced
mod (N) with 0 being identified as N.

(b) Method of multiple start. In the first stage draw a systematic sample $s_1$
of size $n_1$ from the N units, in the second stage a systematic sample $s_2$ of
size $n_2$ from the remaining $N - n_1$ units and so on. This process is continued
up to the lth stage till $n_1 + n_2 + ... + n_l = n$ (the prefixed size) is achieved.
The ultimate sample consists of the sub-samples $s_1, s_2, ..., s_l$.

(c) Method using a mixture of sampling designs. The total sample of size n
is split into two parts, $n_1$ and $n - n_1$. A systematic sample of size $n_1$ is
selected from the population consisting of N units and then a sample of
size $n - n_1$ is drawn from the remaining $N - n_1$ units following any
probability sampling design.

Suggest estimates of the total under the above three procedures and unbiased
estimates of their variances.

6.8 Consider the following super population model: 

$E_m(Y_i \mid x_i) = a + bx_i$ 
$V_m(Y_i \mid x_i) = \sigma^2 x_i^g$ 
$\mathrm{cov}_m(Y_i, Y_j \mid x_i, x_j) = 0$, $(i \ne j)$ 

Suppose the units in the population are arranged in increasing order of 
x-values. Denoting by $V_1$ and $V_2$ respectively the variances of the estimator 
under SRSWOR and systematic sampling, compare them under the 
model. 



7. CLUSTER AND MULTI-STAGE 
SAMPLING 


7.1 Introduction 

A population may, in general, be visualized as a hierarchy of sampling 
units of varying sizes and types. For example, to conduct a socio-economic 
survey, a population may be regarded as composed of administrative units such 
as cities, districts or blocks, or of other groups like villages or households. 
The ultimate units of enquiry may not always be the sampling units, because a 
satisfactory sampling frame may not be available, or because of a desire to 
lower the field costs and to complete the survey operations easily and in 
less time. Thus, a cluster may be defined as an aggregate of units of enquiry; 
clusters need not necessarily be natural aggregates like cities, districts, 
schools, constituencies and villages. In area sampling one sometimes takes 
recourse to artificial clustering by imposing grids on maps. If a sample of 
clusters is selected through probability sampling, such a sampling method is 
called cluster sampling. Also, if there are constraints on budget and time, or 
if administrative requirements demand it, the sampled clusters may be further 
sampled for the selection of the ultimate units of enquiry; such a method of 
selection is termed two stage sampling. 

Cluster sampling may alternatively be termed single stage or unistage 
cluster sampling. In two stage sampling a probability sample of clusters, or 
non-overlapping sub-populations into which the population is subdivided (also 
called first stage units or primary sampling units), is drawn. This is called 
the first stage sample. The first stage units (fsu) or primary sampling units 
(psu) are further subdivided into second stage units (ssu) or secondary 
sampling units, which may be either elements or clusters depending on the 
sampler's convenience and other requirements. Then a probability sample of 
ssu's is drawn in the second stage of sampling from each psu selected at the 
first stage. Two stage sampling may be extended to three or more stages of 
sampling, termed multi-stage sampling. For example, to estimate the yield per 
acre of paddy in a region, we may adopt a four stage sampling design where the 
districts, blocks, villages and fields are the 1st, 2nd, 3rd and 4th stage 
sampling units respectively. 

The cost of a survey is reduced under cluster sampling because 
it is easier, faster and cheaper to collect information from contiguous units 
than from units which are scattered over the region. However, cluster 
sampling is generally less efficient than the sampling of individual units of 
enquiry from the point of view of sampling variance, because the latter 
procedure is more likely to provide a representative cross-section of the 
population, the units within a cluster having a general tendency to 
be similar. As it is cheaper and operationally more convenient to deal with 
clusters, the reduction in cost may well compensate for the loss in 
efficiency in cluster sampling. The theory of probability sampling applied to 
the sampling of individual units may be applied in a straightforward manner to 
cluster sampling, with the only requirement that the population is divided into 
non-overlapping clusters for which a sampling frame is available. 


7.2 Cluster Sampling 


Let the finite population U of elementary units be partitioned into N clusters 
denoted by $U_1^c, \ldots, U_N^c$. A probability sample $s_1$ of clusters is selected 
according to a design $p(s_1)$. Let $s_1$ be of size n (fixed). Every element of the 
clusters in the sample $s_1$ is observed. If s denotes the set of elements that 
are observed, then 

\[ s = \bigcup_{i \in s_1} U_i^c . \]

The size of s is $n_c = \sum_{i \in s_1} M_i$, where $M_i$ is the number of elements in 
the ith cluster $U_i^c$, $i = 1, \ldots, N$. 

The inclusion probabilities for the ith cluster to be included in the sample 
and for the (i, j)th pair of clusters to be included in the sample are 
respectively given by 

\[ \pi_{1i} = \sum_{s_1 \ni i} p(s_1) \quad\text{and}\quad \pi_{1ij} = \sum_{s_1 \ni i,j} p(s_1). \]

Denote by $Y_{ij}$ the value of the characteristic (y) under study for the jth 
element of the ith cluster. The population total Y may be expressed as 

\[ Y = \sum_{i=1}^{N}\sum_{j=1}^{M_i} Y_{ij} = \sum_{i=1}^{N} Y_{i.}, \]

where $Y_{i.}$ is the total for the ith cluster. 

Thus, the $\pi$-estimator or Horvitz-Thompson estimator of the population total 
is written as 

\[ \hat{Y}_{c(HT)} = \sum_{i \in s_1}\frac{Y_{i.}}{\pi_{1i}} = \sum_{i \in s_1} t_i, \quad\text{where } t_i = Y_{i.}/\pi_{1i}. \]

With $D_{1ij} = \pi_{1i}\pi_{1j} - \pi_{1ij}$, the variance for the fixed-size design 
considered here may be written in the Sen-Yates-Grundy form 

\[ V(\hat{Y}_{c(HT)}) = \sum_{i=1}^{N}\sum_{j>i} D_{1ij}\,(t_i - t_j)^2 . \]

An unbiased variance estimator is 

\[ \text{Est. } V(\hat{Y}_{c(HT)}) = \sum_{i \in s_1}\sum_{j>i,\ j \in s_1} \frac{D_{1ij}}{\pi_{1ij}}\,(t_i - t_j)^2 . \]
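The Horvitz-Thompson estimate of the total and its variance estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the made-up cluster totals and the choice of SRSWOR for the first-stage design are not from the text, and the variance estimate uses the Sen-Yates-Grundy form, which presumes a fixed sample size and positive joint inclusion probabilities.

```python
from itertools import combinations


def ht_cluster_total(cluster_totals, pi1):
    """Horvitz-Thompson estimate: sum of Y_i. / pi_1i over the sampled clusters."""
    return sum(y / p for y, p in zip(cluster_totals, pi1))


def syg_variance_estimate(cluster_totals, pi1, pi2):
    """Sen-Yates-Grundy variance estimate for a fixed-size design.

    pi2[(i, j)] is the joint inclusion probability of the sampled clusters
    in positions i < j of the sample.
    """
    t = [y / p for y, p in zip(cluster_totals, pi1)]
    est = 0.0
    for i, j in combinations(range(len(t)), 2):
        d = pi1[i] * pi1[j] - pi2[(i, j)]          # D_1ij
        est += d / pi2[(i, j)] * (t[i] - t[j]) ** 2
    return est


# Hypothetical example: SRSWOR of n = 3 clusters out of N = 10.
N, n = 10, 3
totals = [12.0, 20.0, 16.0]                        # made-up cluster totals Y_i.
pi1 = [n / N] * n                                  # pi_1i = n/N under SRSWOR
pi2 = {(i, j): n * (n - 1) / (N * (N - 1))         # pi_1ij under SRSWOR
       for i, j in combinations(range(n), 2)}
y_hat = ht_cluster_total(totals, pi1)
v_hat = syg_variance_estimate(totals, pi1, pi2)
```

Under SRSWOR the variance estimate should coincide with the familiar expression N^2 (1 - n/N) s^2 / n, where s^2 is the sample variance of the cluster totals, which gives a convenient check on the general formula.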
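7.2.1 Simple Random Cluster Sampling (Equal-sized Clusters) 

The N clusters are of equal size, that is, $M_1 = M_2 = \ldots = M_N = M$ (say). 

The population mean is $\bar{Y} = \frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M} Y_{ij}$. 

The population total is $Y = \sum_{i}\sum_{j} Y_{ij}$. 

For simple random sampling of n clusters, an unbiased estimator of the 
population mean $\bar{Y}$ is given by 

\[ \bar{y}_c = \frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i.}, \quad\text{where}\quad \bar{y}_{i.} = \frac{1}{M}\sum_{j=1}^{M} y_{ij}, \]

$y_{ij}$ being the value of the jth observation in the ith cluster in the sample. 
Its variance is 

\[ V(\bar{y}_c) = \left(\frac{1}{n}-\frac{1}{N}\right) S_b^2, \quad\text{where}\quad S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\bar{Y}_{i.}-\bar{Y})^2 . \]

\[ \text{Est. } V(\bar{y}_c) = v(\bar{y}_c) = \left(\frac{1}{n}-\frac{1}{N}\right) s_b^2, \]

where $s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n}(\bar{y}_{i.}-\bar{y}_c)^2$ and $E(s_b^2) = S_b^2$. 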


Efficiency of Cluster Sampling 

Write 

\[ \sum_{i=1}^{N}\sum_{j=1}^{M}(Y_{ij}-\bar{Y})^2 = \sum_{i}\sum_{j}(\bar{Y}_{i.}-\bar{Y})^2 + \sum_{i}\sum_{j}(Y_{ij}-\bar{Y}_{i.})^2 . \]

That is, $(NM-1)S^2 = (N-1)MS_b^2 + N(M-1)S_w^2$, 

where $S_w^2 = \frac{1}{N(M-1)}\sum_{i}\sum_{j}(Y_{ij}-\bar{Y}_{i.})^2$. 

To compare the estimator $\bar{y}_c$ with $\bar{y}$ based on a SRSWOR of nM 
elements from the NM elements in the population, we have the relative efficiency 

\[ RE = \frac{\left(\frac{1}{nM}-\frac{1}{NM}\right)S^2}{\left(\frac{1}{n}-\frac{1}{N}\right)S_b^2} = \frac{S^2}{MS_b^2}. \]

But 

\[ MS_b^2 = \frac{NM-1}{N-1}S^2 - \frac{N(M-1)}{N-1}S_w^2 . \]

Therefore, 

\[ RE = \frac{1}{1 + \dfrac{N(M-1)\,\delta}{N-1}}, \]

where $\delta = 1 - S_w^2/S^2$ is the coefficient of homogeneity (Särndal, Swensson and 
Wretman, 1992) and $S_w^2/S^2$ is the ratio of the within cluster variance to the 
total variance in the population. 

For large N, 

\[ RE = \frac{1}{1 + (M-1)\,\delta}. \]

$\delta$ is positive or negative according as the pooled within cluster variance is 
smaller than or greater than the total variance in the population. 

It may be seen that $-(N-1)/N(M-1) \le \delta \le 1$, or approximately 
$-1/(M-1) \le \delta \le 1$. 

A large value of $\delta$ signifies that the elements in the same cluster are 
similar, whereas a small value of $\delta$ means that the elements in the same 
cluster are dissimilar with respect to the study variable. 
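The variance components and the coefficient of homogeneity are easy to compute for a fully known population of equal-sized clusters. The sketch below assumes NumPy and a made-up population of N = 4 clusters of size M = 3; the function name and the data are illustrative only. It returns the relative efficiency both as S^2/(M S_b^2) and through the expression in delta, so the two can be compared.

```python
import numpy as np


def cluster_variance_components(y):
    """y: array of shape (N, M), one row per cluster of equal size M."""
    y = np.asarray(y, dtype=float)
    N, M = y.shape
    grand_mean = y.mean()
    cluster_means = y.mean(axis=1)
    S2 = ((y - grand_mean) ** 2).sum() / (N * M - 1)               # total variance
    Sb2 = ((cluster_means - grand_mean) ** 2).sum() / (N - 1)      # between clusters
    Sw2 = ((y - cluster_means[:, None]) ** 2).sum() / (N * (M - 1))  # within clusters
    delta = 1.0 - Sw2 / S2                                         # homogeneity
    re_direct = S2 / (M * Sb2)
    re_delta = 1.0 / (1.0 + N * (M - 1) * delta / (N - 1))
    return S2, Sb2, Sw2, delta, re_direct, re_delta


# Made-up population: elements within a cluster are alike, clusters differ.
population = [[1, 2, 3], [2, 3, 4], [6, 7, 8], [7, 8, 9]]
S2, Sb2, Sw2, delta, re_direct, re_delta = cluster_variance_components(population)
```

In this example the clusters are internally homogeneous, so delta is close to 1 and cluster sampling is markedly less efficient than SRSWOR of the same number of elements.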

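An alternative expression for $V(\bar{y}_c)$: 

\[ V(\bar{y}_c) = \left(\frac{1}{n}-\frac{1}{N}\right) S_b^2 \]
\[ = \left(\frac{1}{n}-\frac{1}{N}\right)\frac{1}{M}\,\frac{NM-1}{N-1}\,S^2\left[1 - \frac{N(M-1)}{NM-1}\,\frac{S_w^2}{S^2}\right] \]
\[ \approx \left(\frac{1}{n}-\frac{1}{N}\right)\frac{NM-1}{M(N-1)}\,\frac{S^2}{M}\,[1 + (M-1)\,\delta], \]

by approximating N/(NM - 1) by N/NM and putting $S_w^2/S^2 = 1 - \delta$. 

Under this approximation, $V(\bar{y}_c) \le V(\bar{y})$ according as $\delta \le -1/(NM-1)$. 
Empirical evidence shows that $\delta$ is usually positive and decreases as M 
increases. A large value of M outweighs the decrease in $\delta$ and thus results 
in a substantial increase in the variance of $\bar{y}_c$. 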

The finite population variance $S^2$ can be estimated from a cluster sample 
of nM elements as 

\[ \hat{S}^2 = \frac{(N-1)M s_b^2 + N(M-1) s_w^2}{NM-1}, \quad\text{where } s_w^2 = \frac{1}{n}\sum_{i=1}^{n} s_i^2 \]

and $s_i^2 = \frac{1}{M-1}\sum_{j}(y_{ij}-\bar{y}_{i.})^2$ is the variance within the ith sampled 
cluster. 

The relative efficiency may be estimated by estimating $\delta$ as $\hat{\delta} = 1 - (s_w^2/\hat{S}^2)$. 

Variance and Cost Functions 

As the variation between clusters is influenced by the cluster size M, it is 
necessary to search for a model expressing $S_b^2$ as a function of M so as to 
enable us to predict the value of $S_b^2$ for given M. This can be achieved by 
investigating a general law to predict $S_w^2$ for given M and using it to arrive 
at a predicting formula for $S_b^2$. Jessen (1942), Mahalanobis (1944) and 
Hendricks (1944) have developed, through the conduct of many agricultural 
surveys, an empirical formula relating $S_w^2$ to M as $S_w^2 = AM^g$ (g > 0), where 
A and g are constants that do not depend on M. As it happens in practice, g is 
usually very small and $S_w^2$ increases steadily as M increases. Thus, 

\[ MS_b^2 = \frac{(NM-1)S^2 - N(M-1)S_w^2}{N-1} = \frac{(NM-1)S^2 - N(M-1)AM^g}{N-1} \approx MS^2 - (M-1)AM^g \]

for large N. 

Jessen (1942) has developed a cost function from farm surveys of the form 
$C = C_1 Mn + C_2\sqrt{n}$, where $C_1$ stands for the cost of the interview and the 
cost of travel from element to element within a cluster, and $C_2\sqrt{n}$ measures 
the cost of travel between the clusters. 

The optimum value of M can be obtained by minimizing 

\[ V(\bar{y}_c) = \frac{S^2 - (M-1)AM^{g-1}}{n} \]

for fixed C, and the minimizing equation for M after some simplification is 
given by 

\[ \frac{AM^{g-1}[gM - (g-1)]}{S^2 - (M-1)AM^{g-1}} = 1 - \left(1 + \frac{4CC_1M}{C_2^2}\right)^{-1/2}. \]

An explicit solution for M seems difficult and hence a trial and error method 
may be adopted to get an optimum value of M. An optimum value of n is obtained 
from the cost equation, which gives 

\[ \sqrt{n} = \frac{-C_2 + (C_2^2 + 4C_1CM)^{1/2}}{2C_1M}, \]

by substituting the value of M so obtained. 

Interested readers may refer to Cochran (1977) and 
Sukhatme et al. (1984) for details. 
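The trial and error search for the optimum cluster size can be carried out numerically. The sketch below is one way to do it under stated assumptions: it takes the variance law S_w^2 = A M^g and the cost function C = C1 M n + C2 sqrt(n) from the discussion above, treats n as continuous, ignores the finite population correction, and scans integer values of M. The function names and all numerical values are hypothetical.

```python
import math


def clusters_affordable(M, C, C1, C2):
    """Solve C = C1*M*n + C2*sqrt(n) for n (treated as continuous)."""
    root_n = (-C2 + math.sqrt(C2 ** 2 + 4.0 * C1 * C * M)) / (2.0 * C1 * M)
    return root_n ** 2


def variance_for_size(M, S2, A, g, C, C1, C2):
    """V(y_c) = [S^2 - (M - 1) A M^(g-1)] / n with n fixed by the budget C."""
    numerator = S2 - (M - 1) * A * M ** (g - 1)
    if numerator <= 0:
        raise ValueError("variance law S_w^2 = A*M^g is not valid for this M")
    return numerator / clusters_affordable(M, C, C1, C2)


def best_cluster_size(S2, A, g, C, C1, C2, max_M=100):
    """Scan M = 1, ..., max_M and return the size with the smallest variance."""
    return min(range(1, max_M + 1),
               key=lambda M: variance_for_size(M, S2, A, g, C, C1, C2))


# Hypothetical inputs: travel between clusters is expensive (large C2).
params = dict(S2=10.0, A=6.0, g=0.1, C=5000.0, C1=1.0, C2=200.0)
M_opt = best_cluster_size(**params)
n_opt = clusters_affordable(M_opt, params["C"], params["C1"], params["C2"])
```

A direct scan is used here instead of solving the minimizing equation, since that equation has no explicit solution in M; in practice the resulting n would be rounded to an integer and the budget rechecked.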


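7.2.2 Unequal Clusters 

Define 

\[ \bar{M} = \frac{1}{N}\sum_{i=1}^{N} M_i, \quad \bar{Y}_{i.} = \frac{1}{M_i}\sum_{j=1}^{M_i} Y_{ij}, \quad \bar{Y} = \frac{1}{N\bar{M}}\sum_{i=1}^{N}\sum_{j=1}^{M_i} Y_{ij}, \]

\[ U_i = (M_i\bar{Y}_{i.})/\bar{M} \quad\text{and}\quad V_i = M_i/\bar{M}. \]

Hence, 

\[ \bar{U} = \frac{1}{N}\sum_{i=1}^{N} U_i = \bar{Y}, \qquad \bar{V} = \frac{1}{N}\sum_{i=1}^{N} V_i = 1. \]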

Estimators of Population Mean Under Different Sampling 
Schemes 

a) Simple Random Sampling without Replacement 

i) Simple mean of cluster means: $\bar{y}_c = \frac{1}{n}\sum_{i \in s}\bar{y}_{i.}$ 

ii) Weighted mean of cluster means: $\bar{y}_{cw} = \frac{1}{n}\sum_{i \in s} u_i = \bar{u}$, say 

iii) Ratio estimator: $\bar{y}_{cr} = \bar{u}/\bar{v}$, where $\bar{v} = \frac{1}{n}\sum_{i \in s} v_i$ 

iv) Regression estimator: $\bar{y}_{crg} = \bar{u} + \hat{\beta}(1 - \bar{v})$, where 

\[ \hat{\beta} = \frac{s_{uv}}{s_v^2} = \frac{\frac{1}{n-1}\sum_i (u_i-\bar{u})(v_i-\bar{v})}{\frac{1}{n-1}\sum_i (v_i-\bar{v})^2}. \]

$\bar{y}_c$ is a biased estimator of $\bar{Y}$ and the bias is given by 

\[ \text{Bias}(\bar{y}_c) = -\frac{1}{N\bar{M}}\sum_{i=1}^{N}\bar{Y}_{i.}(M_i - \bar{M}). \]

The bias vanishes if there is no correlation between the cluster mean $\bar{Y}_{i.}$ 
and its size $M_i$. Approximately, 

\[ V(\bar{y}_c) = (1-f)\,S_b^2/n, \quad f = n/N, \quad\text{where } S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\bar{Y}_{i.}-\bar{Y})^2, \]

with Est. $V(\bar{y}_c) = (1-f)\,s_b^2/n$, where $s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n}(\bar{y}_{i.}-\bar{y}_c)^2$. 

$\bar{y}_{cw}$ is an unbiased estimator of $\bar{Y}$, with 

\[ V(\bar{y}_{cw}) = (1-f)\,S_u^2/n, \]

where 

\[ S_u^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{M_i\bar{Y}_{i.}}{\bar{M}} - \bar{Y}\right)^2 = \frac{1}{N-1}\sum_{i=1}^{N}(U_i - \bar{U})^2 . \]

Est. $V(\bar{y}_{cw}) = (1-f)\,s_u^2/n$, where $s_u^2 = \frac{1}{n-1}\sum_{i=1}^{n}(u_i-\bar{u})^2$. 

As is evident, $\bar{y}_{cr}$ is a biased estimator of $\bar{Y}$, the bias being zero when 
the relationship between u and v is linear and passes through the origin. 

To order O(1/n), $MSE(\bar{y}_{cr}) = V(\bar{y}_{cr}) = (1/n - 1/N)\,S_r^2$, where 

\[ S_r^2 = \frac{1}{N-1}\sum_{i=1}^{N}(U_i - \bar{Y}V_i)^2 = \frac{1}{N-1}\sum_{i=1}^{N}\frac{M_i^2}{\bar{M}^2}(\bar{Y}_{i.}-\bar{Y})^2 . \]

To the first order of approximation, $\bar{y}_{cr}$ is more efficient than $\bar{y}_{cw}$ if 

\[ \rho_{uv} > \tfrac{1}{2}\,CV(v)/CV(u). \]

The regression estimator $\bar{y}_{crg}$ is also biased, with bias negligible for 
large n, and becomes unbiased when the relation between u and v is linear. 
To order O(1/n), 

\[ MSE(\bar{y}_{crg}) = (1-f)\,S_{rg}^2/n, \quad\text{where } S_{rg}^2 = S_u^2(1 - \rho_{uv}^2). \]

It may be further observed that $\bar{y}_{crg}$ is asymptotically superior to all 
the other estimators unless $M_i$ and $\bar{Y}_{i.}$ are uncorrelated. 
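The four estimators under SRSWOR of clusters can be computed side by side from the sampled cluster means and sizes. The following is a minimal sketch assuming NumPy; the function name and the example figures are made up for illustration, and the population mean cluster size (M_bar below) is assumed to be known, as the definitions of u_i and v_i require.

```python
import numpy as np


def unequal_cluster_estimators(cluster_means, cluster_sizes, M_bar):
    """Simple, weighted, ratio and regression estimates of the population mean.

    cluster_means, cluster_sizes: values for the n sampled clusters.
    M_bar: mean cluster size in the population (assumed known).
    """
    ybar = np.asarray(cluster_means, dtype=float)
    M = np.asarray(cluster_sizes, dtype=float)
    u = M * ybar / M_bar                       # u_i = M_i * ybar_i / M_bar
    v = M / M_bar                              # v_i = M_i / M_bar
    u_bar, v_bar = u.mean(), v.mean()
    beta_hat = np.cov(u, v, ddof=1)[0, 1] / v.var(ddof=1)
    return {
        "simple": ybar.mean(),
        "weighted": u_bar,
        "ratio": u_bar / v_bar,
        "regression": u_bar + beta_hat * (1.0 - v_bar),
    }


# Made-up sample of three clusters whose means do not depend on size.
estimates = unequal_cluster_estimators([5.0, 5.0, 5.0], [10, 20, 40], M_bar=25.0)
```

In this example u is exactly proportional to v, so the ratio and regression estimators both return the common cluster mean, whereas the weighted mean is pulled away from it because the sampled clusters happen to be smaller than average.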

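b) Sampling with Probability Proportional to Cluster Size $M_i$ (with 
Replacement) 

An unbiased estimator of the population mean $\bar{Y}$ is given by 

\[ \bar{y}_{cp} = \frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i.}, \]

with variance 

\[ V(\bar{y}_{cp}) = \frac{1}{n}\sum_{i=1}^{N}\frac{M_i}{N\bar{M}}(\bar{Y}_{i.}-\bar{Y})^2 \]

and Est. $V(\bar{y}_{cp}) = \frac{1}{n(n-1)}\sum_{i=1}^{n}(\bar{y}_{i.}-\bar{y}_{cp})^2$. 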

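c) Stratified Cluster Sampling 

Suppose the N clusters are grouped into k strata, $N_1 + \ldots + N_k = N$. With 
usual notations, 

\[ \bar{y}_{c(st)} = \frac{1}{N\bar{M}}\sum_{h=1}^{k} N_hM_h\,\frac{1}{n_h}\sum_{i=1}^{n_h}\bar{y}_{hi.} \]

is an unbiased estimator of $\bar{Y}$, with 

\[ V(\bar{y}_{c(st)}) = \frac{1}{N^2\bar{M}^2}\sum_{h=1}^{k} N_h^2M_h^2\left(\frac{1}{n_h}-\frac{1}{N_h}\right)S_{bh}^2, \]

where 

\[ S_{bh}^2 = \frac{1}{N_h-1}\sum_{i=1}^{N_h}(\bar{Y}_{hi.}-\bar{Y}_h)^2 . \]

An unbiased estimator of $V(\bar{y}_{c(st)})$ is obtained by substituting $s_{bh}^2$ for 
$S_{bh}^2$ in the variance formula, where 

\[ s_{bh}^2 = \frac{1}{n_h-1}\sum_{i=1}^{n_h}(\bar{y}_{hi.}-\bar{y}_h)^2 . \]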


Comparison of Various Estimators 

Consider a super-population model (Cochran, 1963), where it is assumed that 

\[ \bar{Y}_{i.} = \frac{\alpha}{M_i} + \beta + e_i, \]

where $E(e_i \mid M_i) = 0$, $E(e_ie_j \mid M_i, M_j) = 0$ and $V(e_i \mid M_i) = a/M_i^g$, $0 < g < 2$. 

Under this model, Sukhatme et al. (1984) have compared the mean square 
errors of $\bar{y}_{cw}$, $\bar{y}_{cr}$, $\bar{y}_{crg}$, $\bar{y}_{cp}$ and $\bar{y}_{c(st)}$ and have shown that $\bar{y}_{c(st)}$ with 
proportional allocation and $\bar{y}_{crg}$ are equally efficient and more efficient 
than the other estimators under comparison. The estimate based on stratified 
sampling is in general to be preferred to the regression estimator because it 
is unbiased and, moreover, can be improved upon by using optimum allocation. 
However, if the clusters within a stratum are not of the same size, the 
efficiency of the estimate based on stratified sampling with optimum 
allocation might fall behind that of the other estimators under comparison. 
The analysis of data relating to the number of villages and the area under 
wheat for each of the 89 administrative areas in the Hapur sub-division of 
Meerut district (India) corroborates these conclusions 
(Sukhatme et al., 1984). 
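7.2.3 Estimation of Proportion 

Suppose it is desired to estimate the proportion of elements that fall into a 
specified category C when a sample of n clusters is selected from a population 
of N clusters by SRSWOR. Define $Y_{ij} = 1$ if the jth element of the ith cluster 
belongs to class C and $Y_{ij} = 0$ otherwise. 

Hence, 

\[ P = \frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M} Y_{ij} = Y/NM \quad\text{(equal cluster size)}. \]

An unbiased estimate of P is 

\[ \hat{p} = \sum_{i=1}^{n}\sum_{j=1}^{M}\frac{y_{ij}}{nM} = \frac{1}{n}\sum_{i=1}^{n} p_i, \]

where $p_i = \frac{1}{M}\sum_{j=1}^{M} y_{ij}$ defines the proportion belonging to class C in the 
ith cluster in the sample. 

In case of unequal cluster sizes, 

\[ P = \sum_{i=1}^{N}\sum_{j=1}^{M_i} Y_{ij}\Big/\sum_{i=1}^{N} M_i \quad\text{and}\quad \hat{p} = \sum_{i=1}^{n}\sum_{j=1}^{M_i} y_{ij}\Big/\sum_{i=1}^{n} M_i . \]

As a first approximation, 

\[ MSE(\hat{p}) = \frac{1}{\bar{M}^2}\left(\frac{1}{n}-\frac{1}{N}\right)\frac{\sum_{i=1}^{N} M_i^2(P_i - P)^2}{N-1}, \]

where $P_i$ is the proportion in the ith cluster and $\bar{M} = \sum M_i/N$. 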


7.2.4 Estimation of Gain in Efficiency of Cluster Sampling from 
a Sample of Clusters (Equal Cluster Size) 

If the sample is selected as clusters, an unbiased estimate of $V(\bar{y}_c)$ is given 
by 

\[ v(\bar{y}_c) = \left(\frac{1}{n}-\frac{1}{N}\right)\sum_{i=1}^{n}(\bar{y}_{i.}-\bar{y}_c)^2/(n-1). \]

The variance of the estimate of the population mean based on an SRSWOR of nM 
elements from a population consisting of NM elements is 

\[ V_1(\bar{y}) = \left(\frac{1}{nM}-\frac{1}{NM}\right)\frac{\sum_i\sum_j (Y_{ij}-\bar{Y})^2}{NM-1}. \]

Now 

\[ \sum_i\sum_j (Y_{ij}-\bar{Y})^2 = \sum_i\sum_j (Y_{ij}-\bar{Y}_{i.})^2 + M\sum_{i=1}^{N}(\bar{Y}_{i.}-\bar{Y})^2 . \]

Both the terms on the right hand side can be estimated unbiasedly from the 
sample of clusters, since 

\[ E\Big[\frac{1}{n}\sum_{i=1}^{n}\sum_{j}(y_{ij}-\bar{y}_{i.})^2\Big] = \sum_{i=1}^{N}\sum_{j}(Y_{ij}-\bar{Y}_{i.})^2/N \]

and 

\[ E\Big[\sum_{i=1}^{n}(\bar{y}_{i.}-\bar{y}_c)^2/(n-1)\Big] = \sum_{i=1}^{N}(\bar{Y}_{i.}-\bar{Y})^2/(N-1). \]

Thus an unbiased estimator $v_1(\bar{y})$ of $V_1(\bar{y})$ can be obtained from the sample 
of clusters and compared with $v(\bar{y}_c)$ to estimate the gain in efficiency. 


7.3 Two Stage Sampling 

As cluster sampling is in general less efficient than the direct sampling of 
elements, because of the tendency of the elements of the same cluster to be 
alike, the efficiency can be increased by selecting more clusters. But this 
may not be an attractive proposition because of the increase in the cost of 
enumerating the elements. To control the cost while allowing an increase in 
the number of clusters, it seems natural to select a large number of clusters 
and sub-sample them instead of enumerating them completely. The cluster totals 
are then estimated from the sub-samples. The homogeneity within clusters 
leads to a small variance for the sub-sample estimates even for moderate 
sub-sample sizes. A two stage sample thus has two components of variation, one 
due to the sampling of fsu's and the other due to sub-sampling within the 
selected first stage units. In two stage sampling the first stage units are 
clusters of second stage units, and to obtain a two stage sample we first 
select a sample ($s_1$) of fsu's following any probability sampling design and 
next a probability sample ($s_2$) of second stage units (which may be either 
clusters or elements) from each of the selected fsu's. Thus, two stage 
sampling is a compromise between cluster sampling and direct sampling of the 
ultimate units (unistage sampling) from the population. If the ultimate units 
are obtained at r (> 2) stages, the sample is called a multi-stage sample. 

Multi-stage sampling has been found to be useful in many current large 
scale sample surveys in many countries of the world. Mahalanobis (1940) used 
this sampling procedure in crop surveys carried out in the then Bengal of India 
between 1937 and 1944 and termed this procedure nested sampling. 
7.3.1 Formulation of Problem and Generalized 
Horvitz-Thompson Estimator 

Let U denote the finite population of $N\bar{M}$ elements $(1, \ldots, N\bar{M})$. Let the $N\bar{M}$ 
elements be grouped into N clusters (fsu's) $U^c = (U_1^c, \ldots, U_N^c)$, where $U_i^c$ has 
$M_i$ elements. First, a probability sample $s_1$ of n fsu's is selected according 
to any sampling design $p(s_1)$. Then, for every $i \in s_1$, a sample $s_{2i}$ of $m_i$ 
elements or ssu's is drawn from $U_i^c$ ($s_{2i} \subset U_i^c$) according to a design 
$p(s_{2i} \mid s_1)$. Usually the same sampling design $p(s_{2i} \mid s_1)$ is adopted for 
selecting the second stage sample from each fsu in $s_1$. The sample of second 
stage units constitutes 

\[ s_2 = \bigcup_{i \in s_1} s_{2i}. \]

Further, for every $s_1$, 

\[ \Pr\Big[\bigcup_{i \in s_1} s_{2i} \,\Big|\, s_1\Big] = \prod_{i \in s_1}\Pr[s_{2i} \mid s_1]. \]

This implies that sub-sampling in a given fsu is independent of the 
sub-sampling in any other fsu. 

Let $Y_{ij}$ be the value of the characteristic y corresponding to the jth ssu of 
the ith fsu in the population, with $i = 1, \ldots, N$ and $j = 1, \ldots, M_i$. Let $y_{ij}$ be 
the value for the jth ssu in the ith fsu in the sample; $i = 1, \ldots, n$ and 
$j = 1, \ldots, m_i$. 

The population total in the ith fsu or psu is denoted by 

\[ Y_{i.} = \sum_{j=1}^{M_i} Y_{ij} \]

and the total of the characteristic for the entire population by 
$Y = \sum_{i=1}^{N} Y_{i.}$. The population mean is 

\[ \bar{Y} = \sum_{i=1}^{N} Y_{i.}\Big/\sum_{i=1}^{N} M_i . \]


Inclusion Probabilities 

The inclusion probability of the kth element of the population in $s_2$ is 
defined as $\pi_k = \pi_{1i}\,\pi_{k/i}$ if $k \in U_i^c$, where $\pi_{k/i}$ is the conditional 
inclusion probability. 

Second order inclusion probabilities are 

\[ \pi_{kl} = \begin{cases} \pi_{1i}\,\pi_{k/i} & \text{if } k = l \in U_i^c \\ \pi_{1i}\,\pi_{kl/i} & \text{if } k, l \in U_i^c,\ k \ne l \\ \pi_{1ij}\,\pi_{k/i}\,\pi_{l/j} & \text{if } k \in U_i^c \text{ and } l \in U_j^c\ (i \ne j) \end{cases} \]

Suppose we condition on the event that the first stage sample $s_1$ has been 
realized. 






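Define $t_{k/i} = y_k/\pi_{k/i}$. 

Then the HT estimator of the ith fsu total is 

\[ \hat{Y}_{i.} = \sum_{k \in s_{2i}}\frac{y_k}{\pi_{k/i}} = \sum_{k \in s_{2i}} t_{k/i}. \]

It may be verified that in repeated sampling of the ith fsu $U_i^c$ according to 
an arbitrary sampling design $p(s_{2i})$, $\hat{Y}_{i.}$ is an unbiased estimate of $Y_{i.}$. 

With $D_{kl/i} = \pi_{k/i}\,\pi_{l/i} - \pi_{kl/i}$, the variance of $\hat{Y}_{i.}$ for a fixed 
sub-sample size $m_i$ may be written in the Sen-Yates-Grundy form 

\[ V(\hat{Y}_{i.}) = \sum_{k \in U_i^c}\ \sum_{l > k,\ l \in U_i^c} D_{kl/i}\,(t_{k/i} - t_{l/i})^2 . \]

The Horvitz-Thompson estimator of the population total Y is 

\[ \hat{Y}_{HT} = \sum_{i \in s_1}\frac{\hat{Y}_{i.}}{\pi_{1i}}. \]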

Expectation and Variance in Two-Stage Sampling 

The procedure for finding the expectation of an estimate based on two stage 
sampling is first to average the estimate over all second stage selections 
that can be drawn from a fixed set of n fsu's and then to average over all 
possible selections of n fsu's. Thus, for an estimate $\hat{\theta}$ we have 
$E(\hat{\theta}) = E_1[E_2(\hat{\theta})]$, where E stands for expectation over all samples, $E_2$ 
denotes conditional expectation over all possible second stage selections for 
a given set of fsu's and $E_1$ denotes averaging over all first stage selections. 
To find $V(\hat{\theta})$, the formula may be derived as 

\[ V(\hat{\theta}) = V_1E_2(\hat{\theta}) + E_1V_2(\hat{\theta}), \]

where $V_1$ denotes the variance of the conditional expectation of $\hat{\theta}$ and $V_2$ is 
the conditional variance over all possible second stage selections for a 
fixed set of fsu's. 

Similarly, for an r-stage design, 

\[ E(\hat{\theta}) = E_1E_2 \ldots E_r(\hat{\theta}) \]

and 

\[ V(\hat{\theta}) = V_1E_2 \ldots E_r(\hat{\theta}) + E_1V_2E_3 \ldots E_r(\hat{\theta}) + \ldots + E_1E_2 \ldots E_{r-1}V_r(\hat{\theta}). \]

Thus, 

\[ E(\hat{Y}_{HT}) = E_1E_2(\hat{Y}_{HT}) = E_1E_2\Big[\sum_{i \in s_1}\frac{\hat{Y}_{i.}}{\pi_{1i}}\Big] = E_1\Big[\sum_{i \in s_1}\frac{E_2(\hat{Y}_{i.})}{\pi_{1i}}\Big] = E_1\Big[\sum_{i \in s_1}\frac{Y_{i.}}{\pi_{1i}}\Big] = Y, \]

as 

\[ E_2(\hat{Y}_{i.}) = E_2\sum_{k \in s_{2i}}\frac{y_k}{\pi_{k/i}} = Y_{i.}. \]

\[ V(\hat{Y}_{HT}) = V_1E_2(\hat{Y}_{HT}) + E_1V_2(\hat{Y}_{HT}) \]
\[ = \sum_{i=1}^{N}\sum_{j>i} D_{1ij}\left(\frac{Y_{i.}}{\pi_{1i}} - \frac{Y_{j.}}{\pi_{1j}}\right)^2 + \sum_{i=1}^{N}\frac{V(\hat{Y}_{i.})}{\pi_{1i}} = V_{FSU} + V_{SSU}, \]

where 

\[ D_{1ij} = \pi_{1i}\pi_{1j} - \pi_{1ij} \]

and, for a fixed sub-sample size, 

\[ V(\hat{Y}_{i.}) = \sum_{k \in U_i^c}\ \sum_{l > k,\ l \in U_i^c} D_{kl/i}\,(t_{k/i} - t_{l/i})^2 . \]

Therefore, $V(\hat{Y}_{HT})$ consists of two components of variation: the first 
component $V_{FSU}$ is a measure of the variation between fsu's and the second, 
$V_{SSU}$, is a measure of the variation between ssu's within fsu's. 

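An unbiased estimate of $V(\hat{Y}_{HT})$ may be written as 

\[ v(\hat{Y}_{HT}) = \sum_{i \in s_1}\ \sum_{j>i,\ j \in s_1}\frac{\pi_{1i}\pi_{1j} - \pi_{1ij}}{\pi_{1ij}}\left(\frac{\hat{Y}_{i.}}{\pi_{1i}} - \frac{\hat{Y}_{j.}}{\pi_{1j}}\right)^2 + \sum_{i \in s_1}\frac{v(\hat{Y}_{i.})}{\pi_{1i}}. \]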


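7.3.2 Two Stage Sampling with SRSWOR 

Suppose the first stage sample $s_1$ is a SRSWOR sample of size n. In the second 
stage a SRSWOR sample $s_{2i}$ of $m_i$ ssu's is selected from the ith selected fsu 
having $M_i$ ssu's. An unbiased estimator of Y is given by 

\[ \hat{Y} = \frac{N}{n}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i} y_{ij} = \frac{N}{n}\sum_{i=1}^{n} M_i\bar{y}_{i.}, \]

\[ E(\hat{Y}) = E_1E_2(\hat{Y}) = E_1\Big(\frac{N}{n}\sum_{i=1}^{n} Y_{i.}\Big) = Y, \]

where $Y_{i.}$ is the total of the ith fsu in the sample. 

\[ V(\hat{Y}) = V_1E_2(\hat{Y}) + E_1V_2(\hat{Y}) = V_1\Big(\frac{N}{n}\sum_{i=1}^{n} Y_{i.}\Big) + E_1\Big[\frac{N^2}{n^2}\sum_{i=1}^{n} M_i^2(1-f_i)\frac{S_i^2}{m_i}\Big] \]
\[ = N^2\bar{M}^2(1-f)\frac{S_b^2}{n} + \frac{N}{n}\sum_{i=1}^{N} M_i^2(1-f_i)\frac{S_i^2}{m_i}, \]

where $f = n/N$, $f_i = m_i/M_i$, $\bar{M} = \sum M_i/N$, 

\[ S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left[(M_i/\bar{M})\,\bar{Y}_{i.} - \bar{Y}\right]^2 \]

and 

\[ S_i^2 = \frac{1}{M_i-1}\sum_{j=1}^{M_i}(Y_{ij}-\bar{Y}_{i.})^2, \quad\text{with } \bar{Y}_{i.} = \frac{1}{M_i}\sum_{j} Y_{ij} \text{ and } \bar{Y} = \sum_i\sum_j Y_{ij}\Big/\sum_i M_i . \]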
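The variance expression for two stage SRSWOR can be checked by brute force on a very small population, by listing every possible first stage sample and every possible set of sub-samples. The sketch below does this in plain Python; the function names and the toy population are made up, and the between-fsu term is written in terms of the fsu totals, which is algebraically the same as N^2 M_bar^2 (1 - f) S_b^2 / n.

```python
from itertools import combinations, product
from math import comb


def exact_moments(pop, n, m):
    """Exact mean and variance of the two stage estimator by enumeration.

    pop: list of lists, pop[i] holds the ssu values of fsu i.
    n: number of fsu's drawn; m[i]: sub-sample size used if fsu i is drawn.
    """
    N = len(pop)
    mean = second = 0.0
    for s1 in combinations(range(N), n):
        p1 = 1.0 / comb(N, n)
        subsamples = [list(combinations(pop[i], m[i])) for i in s1]
        for choice in product(*subsamples):
            p, est = p1, 0.0
            for i, s2 in zip(s1, choice):
                p /= comb(len(pop[i]), m[i])           # SRSWOR within fsu i
                est += len(pop[i]) * sum(s2) / m[i]    # M_i * ybar_i
            est *= N / n
            mean += p * est
            second += p * est ** 2
    return mean, second - mean ** 2


def formula_variance(pop, n, m):
    """Between-fsu plus within-fsu components of V(Y_hat)."""
    N = len(pop)
    totals = [sum(c) for c in pop]
    t_bar = sum(totals) / N
    between = (N ** 2 * (1 - n / N) / n
               * sum((t - t_bar) ** 2 for t in totals) / (N - 1))
    within = 0.0
    for c, mi in zip(pop, m):
        Mi = len(c)
        c_bar = sum(c) / Mi
        S2_i = sum((y - c_bar) ** 2 for y in c) / (Mi - 1)
        within += Mi ** 2 * (1 - mi / Mi) * S2_i / mi
    return between + N / n * within


toy_pop = [[1, 2, 3], [4, 6], [2, 2, 5, 7]]   # hypothetical population, N = 3
exact_mean, exact_var = exact_moments(toy_pop, n=2, m=[2, 1, 2])
```

Enumeration is feasible only for toy populations, but it confirms both the unbiasedness of the estimator and the two-component form of its variance without any approximation.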


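An unbiased estimator $v(\hat{Y})$ of $V(\hat{Y})$ can be obtained by substituting estimates 
of $S_b^2$ and $S_i^2$ in the variance expression. Now, we have 

\[ \text{Est. } S_i^2 = s_i^2 = \frac{1}{m_i-1}\sum_{j=1}^{m_i}(y_{ij}-\bar{y}_{i.})^2, \]

\[ \text{Est. } S_b^2 = \text{Est. }\frac{1}{(N-1)\bar{M}^2}\Big[\sum_{i=1}^{N} M_i^2\bar{Y}_{i.}^2 - Y^2/N\Big] = \frac{1}{(N-1)\bar{M}^2}\Big[\text{Est. }\sum_{i=1}^{N} M_i^2\bar{Y}_{i.}^2 - \text{Est. } Y^2/N\Big], \]

where 

\[ \text{Est. }\sum_{i=1}^{N} M_i^2\bar{Y}_{i.}^2 = \frac{N}{n}\sum_{i=1}^{n} M_i^2\Big[\bar{y}_{i.}^2 - (1-f_i)\frac{s_i^2}{m_i}\Big] \]

and Est. $Y^2 = \hat{Y}^2 - v(\hat{Y})$. 

On relevant substitutions, we have 

\[ \text{Est. } S_b^2 = \frac{N}{n(N-1)\bar{M}^2}\Big[(n-1)\bar{M}^2 s_b^2 - \sum_{i=1}^{n} M_i^2(1-f_i)\frac{s_i^2}{m_i}\Big] + \frac{v(\hat{Y})}{N(N-1)\bar{M}^2}, \]

where $s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n}\big[(M_i/\bar{M})\,\bar{y}_{i.} - \frac{1}{n}\sum_{i=1}^{n}(M_i/\bar{M})\,\bar{y}_{i.}\big]^2$. 

Thus, 

\[ v(\hat{Y}) = N^2\bar{M}^2(1-f)\frac{s_b^2}{n} + \frac{N}{n}\sum_{i=1}^{n} M_i^2(1-f_i)\frac{s_i^2}{m_i}. \]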
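The estimator of the total and its unbiased variance estimator translate directly into code. The following is a minimal sketch assuming NumPy; the function name and the sample values are made up. The between-fsu term is computed from the estimated fsu totals, which equals N^2 M_bar^2 (1 - f) s_b^2 / n and avoids having to know the mean fsu size. At least two fsu's and at least two ssu's per sampled fsu are needed for the sample variances.

```python
import numpy as np


def two_stage_srswor(samples, fsu_sizes, N):
    """Estimate of the population total and of its variance.

    samples: list of sequences, the observed ssu values in each sampled fsu.
    fsu_sizes: M_i for the sampled fsu's, in the same order.
    N: number of fsu's in the population.
    """
    n = len(samples)
    M = np.asarray(fsu_sizes, dtype=float)
    m = np.array([len(s) for s in samples], dtype=float)
    ybar = np.array([np.mean(s) for s in samples])
    s2 = np.array([np.var(s, ddof=1) for s in samples])   # s_i^2
    totals_hat = M * ybar                                  # M_i * ybar_i
    y_hat = N / n * totals_hat.sum()
    f, f_i = n / N, m / M
    between = N ** 2 * (1 - f) / n * np.var(totals_hat, ddof=1)
    within = N / n * np.sum(M ** 2 * (1 - f_i) * s2 / m)
    return y_hat, between + within


# Hypothetical sample: n = 3 fsu's out of N = 5.
sample_values = [[2, 4], [1, 3, 5], [6, 6]]
y_hat, v_hat = two_stage_srswor(sample_values, fsu_sizes=[4, 6, 2], N=5)
```

In this example the third fsu is completely enumerated (m_i = M_i), so it contributes nothing to the within-fsu term, in line with Remark 2 below on cluster sampling as a special case.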


Remarks 

1. For n = N, two stage sampling reduces to stratified random sampling with 
N strata. 

2. For $m_i = M_i$, two stage sampling corresponds to cluster sampling. 

3. For $M_1 = M_2 = \ldots = M_N = M$ and $m_1 = m_2 = \ldots = m_n = m$, an unbiased 
estimator of the population total is 

\[ \hat{Y} = NM\bar{y}_{..}, \quad\text{where } \bar{y}_{..} = \frac{1}{nm}\sum_i\sum_j y_{ij}, \]

\[ V(\hat{Y}) = \frac{N^2M^2}{n}(1-f)\,S_{b0}^2 + \frac{N^2M^2}{nm}(1-f_1)\,S_{w0}^2, \]

where $S_{b0}^2 = \frac{1}{N-1}\sum_i (\bar{Y}_{i.}-\bar{Y})^2$ 

and $S_{w0}^2 = \frac{1}{N}\sum_i S_i^2 = \frac{1}{N(M-1)}\sum_i\sum_j (Y_{ij}-\bar{Y}_{i.})^2$, 

$\bar{Y}_{i.} = \frac{1}{M}\sum_{j=1}^{M} Y_{ij}$ and $\bar{Y} = \frac{1}{NM}\sum_i\sum_j Y_{ij}$, 

$f = n/N$, $f_1 = m/M$. 

An unbiased estimator of $V(\hat{Y})$ is 

\[ v(\hat{Y}) = \frac{N^2M^2}{n}(1-f)\,s_{b0}^2 + \frac{N^2M^2}{nm}\,f\,(1-f_1)\,s_{w0}^2, \]

where $s_{b0}^2 = \frac{1}{n-1}\sum_i (\bar{y}_{i.}-\bar{y}_{..})^2$ and 
$s_{w0}^2 = \frac{1}{n}\sum_i s_i^2 = \frac{1}{n(m-1)}\sum_i\sum_j (y_{ij}-\bar{y}_{i.})^2$, 

with $\bar{y}_{i.} = \frac{1}{m}\sum_{j=1}^{m} y_{ij}$, $\bar{y}_{..} = \frac{1}{nm}\sum_i\sum_j y_{ij}$, $E(s_{w0}^2) = S_{w0}^2$ 

and $E(s_{b0}^2) = S_{b0}^2 + (1/m - 1/M)\,S_{w0}^2$. 

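Alternative Estimators for Two-stage Sampling with Unequal First 
Stage Units: Biased Estimators 

i) $\hat{Y}_1 = N\bar{M}\,\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i.}$ 

\[ MSE(\hat{Y}_1) = \frac{N^2\bar{M}^2}{n}(1-f)\,\frac{1}{N-1}\sum_{i=1}^{N}\Big(\bar{Y}_{i.} - \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_{i.}\Big)^2 + \frac{N\bar{M}^2}{n}\sum_{i=1}^{N}(1-f_i)\frac{S_i^2}{m_i} + N^2\bar{M}^2\Big(\frac{1}{N}\sum_{i=1}^{N}\bar{Y}_{i.} - \bar{Y}\Big)^2, \]

the last term being the square of the bias. 

ii) $\hat{Y}_r = \Big(\sum_{i=1}^{n} M_i\bar{y}_{i.}\Big/\sum_{i=1}^{n} M_i\Big)\Big(\sum_{i=1}^{N} M_i\Big)$, 

which is the ratio to size estimator. To the first order of approximation, 

\[ MSE(\hat{Y}_r) = \frac{N^2\bar{M}^2}{n}(1-f)\,\frac{1}{N-1}\sum_{i=1}^{N}\frac{M_i^2}{\bar{M}^2}(\bar{Y}_{i.}-\bar{Y})^2 + \frac{N}{n}\sum_{i=1}^{N} M_i^2(1-f_i)\frac{S_i^2}{m_i}. \]

Remarks: Alternatively, one may also construct a regression estimator for the 
purpose. For details about these biased estimators the reader may refer to 
Sukhatme et al. (1984). 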

7.3.3 Stratified Two Stage Sampling 

The population is first divided into strata and from each stratum a two stage 
sample is selected. Stratified multistage sampling is a common design in crop 
surveys and in many rounds of the National Sample Surveys conducted in India. 

Let the population be divided into L strata of sizes $N_1, \ldots, N_L$, $N = \sum_{k=1}^{L} N_k$. 

Let $M_{ki}$ be the number of second stage units in the ith first stage unit of 
the kth stratum, $i = 1, \ldots, N_k$. 

$N_k\bar{M}_k = \sum_{i=1}^{N_k} M_{ki}$ = total number of second stage units in the kth stratum. 

$Y_{kij}$ = value of the characteristic for the jth second stage unit in the ith 
first stage unit of the kth stratum. 

\[ \bar{Y}_{ki.} = \frac{1}{M_{ki}}\sum_{j=1}^{M_{ki}} Y_{kij}, \quad \bar{Y}_{k..} = \frac{1}{N_k\bar{M}_k}\sum_{i=1}^{N_k}\sum_{j=1}^{M_{ki}} Y_{kij}, \quad \bar{Y} = \sum_{k=1}^{L} W_k\bar{Y}_{k..}, \]

where $W_k = N_k\bar{M}_k\Big/\sum_{k=1}^{L} N_k\bar{M}_k$. 

Let $n_k$ be the number of fsu's in the sample from the kth stratum, $n = \sum n_k$. 
Let $m_{ki}$ be the number of ssu's in the sample from the ith selected fsu of the 
kth stratum. Define 

\[ \bar{y}_{ki.} = \frac{1}{m_{ki}}\sum_{j=1}^{m_{ki}} y_{kij}, \quad \bar{y}_k = \frac{1}{n_k}\sum_{i=1}^{n_k}\frac{M_{ki}}{\bar{M}_k}\,\bar{y}_{ki.} \]

Now, an unbiased estimate of the population mean in stratified two stage 
sampling is given by 

\[ \bar{y} = \sum_{k=1}^{L} W_k\bar{y}_k, \]

with variance 

\[ V(\bar{y}) = \sum_{k=1}^{L} W_k^2\left[\Big(\frac{1}{n_k}-\frac{1}{N_k}\Big)S_{kb}^2 + \frac{1}{n_kN_k}\sum_{i=1}^{N_k}\Big(\frac{M_{ki}}{\bar{M}_k}\Big)^2\Big(\frac{1}{m_{ki}}-\frac{1}{M_{ki}}\Big)S_{ki}^2\right], \]

where $S_{kb}^2$ and $S_{ki}^2$ are the within-stratum analogues of $S_b^2$ and $S_i^2$ of 
Section 7.3.2. The variance can be estimated unbiasedly by estimating $S_{kb}^2$ 
and $S_{ki}^2$ from each stratum. 


7.3.4 Two Stage Sampling: First Stage Sampling with PPSWR 
and Second Stage Sampling with SRSWOR 

The efficiency of two stage sampling may be increased by adopting unequal 
probabilities of selection either at the first stage or at the second stage 
or at both stages. 

Consider a two stage sample where the fsu's are selected with probability 
proportional to size $z_i$ with replacement. Let $p_i = z_i\big/\sum_{i=1}^{N} z_i$ be the 
probability of selecting the ith fsu of the population. From the ith fsu 
selected, a sub-sample of $m_i$ ssu's is selected according to SRSWOR. An 
unbiased estimate of the population total is 

\[ \hat{Y}_{pps} = \frac{1}{n}\sum_{i=1}^{n}\frac{M_i\bar{y}_{i.}}{p_i}, \]

with 

\[ V(\hat{Y}_{pps}) = \frac{1}{n}\sum_{i=1}^{N} p_i\Big(\frac{Y_{i.}}{p_i} - Y\Big)^2 + \frac{1}{n}\sum_{i=1}^{N}\frac{\sigma_i^2}{p_i}, \]

where $\sigma_i^2 = M_i^2(1-f_i)\,S_i^2/m_i$. 

Since the fsu's are selected according to a probability sampling scheme with 
replacement, each of the n quantities $M_i\bar{y}_{i.}/p_i$ provides an independent 
unbiased estimate of Y. Therefore, an unbiased variance estimator of the 
combined estimate $\hat{Y}_{pps}$ is 

\[ v(\hat{Y}_{pps}) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\Big(\frac{M_i\bar{y}_{i.}}{p_i} - \hat{Y}_{pps}\Big)^2 = \frac{1}{n(n-1)}\sum_{i=1}^{n}\Big(\frac{\hat{Y}_{i.}}{p_i} - \hat{Y}_{pps}\Big)^2 . \]

Remark 1. For $p_i = M_i/\sum M_i = M_i/M_0$, 

\[ \hat{Y}_{pps} = \frac{M_0}{n}\sum_{i=1}^{n}\bar{y}_{i.}. \]

This estimate becomes self-weighting when $m_i = m$, so that 

\[ \hat{Y}_{pps} = M_0\,\frac{1}{nm}\sum_i\sum_j y_{ij} = M_0\,\bar{y}. \]

Remark 2. In PPSWR an fsu may enter the sample more than once. Therefore, 
instead of sub-sampling an fsu independently every time, it would be cheaper to 
sub-sample it once and weight it by the frequency with which it occurs in the 
sample. If $\lambda_i$ is the number of times $U_i$ occurs in the sample, an estimator of 
the population total Y may be proposed as 

\[ \hat{Y}^*_{pps} = \frac{1}{n}\sum_{i=1}^{N}\lambda_i\frac{\hat{Y}_{i.}}{p_i}, \]

where $E(\lambda_i) = np_i$, $V(\lambda_i) = np_i(1-p_i)$ and $\mathrm{cov}(\lambda_i, \lambda_j) = -np_ip_j$. 

It may be verified that $E(\hat{Y}^*_{pps}) = Y$ and 

\[ V(\hat{Y}^*_{pps}) = \frac{1}{n}\sum\frac{\sigma_i^2}{p_i} + \frac{n-1}{n}\sum\sigma_i^2 + \frac{1}{n}\sum p_i\Big(\frac{Y_{i.}}{p_i} - Y\Big)^2, \]

where $\sigma_i^2 = V(\hat{Y}_{i.})$ is the conditional variance of the estimate of the ith 
fsu total. Thus, 

\[ V(\hat{Y}^*_{pps}) - V(\hat{Y}_{pps}) = \frac{n-1}{n}\sum\sigma_i^2 . \]

Alternatively, if the ith fsu is selected $\lambda_i$ times, one may draw a single 
sub-sample of $\lambda_im_i$ units without replacement from the $M_i$ ssu's, provided 
$M_i \ge \lambda_im_i$. Sukhatme (1953) has shown that with this method $V(\hat{Y}_{pps})$ is 
reduced by $(n-1)\sum M_iS_i^2/n$. 
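Because each draw in PPSWR yields an independent estimate of the total, the estimator and its variance estimator need very little code. The sketch below is a plain-Python illustration; the function name and the figures are hypothetical, and each draw is assumed to be sub-sampled independently (one entry per draw, so an fsu drawn twice appears twice).

```python
def pps_two_stage(fsu_means, fsu_sizes, p):
    """PPSWR at the first stage, SRSWOR at the second stage.

    fsu_means: sub-sample means ybar_i, one entry per draw.
    fsu_sizes: M_i for the fsu selected at each draw.
    p: selection probability p_i of the fsu selected at each draw.
    """
    n = len(fsu_means)
    # Each z_i = M_i * ybar_i / p_i is an unbiased estimate of the total Y.
    z = [M * y / pi for M, y, pi in zip(fsu_sizes, fsu_means, p)]
    y_hat = sum(z) / n
    v_hat = sum((zi - y_hat) ** 2 for zi in z) / (n * (n - 1))
    return y_hat, v_hat


# Hypothetical draws with p_i = M_i / M_0, where M_0 = 100.
y_hat, v_hat = pps_two_stage([2.0, 4.0, 6.0], [10, 20, 30], [0.1, 0.2, 0.3])
```

With probabilities proportional to M_i, each draw contributes M_0 times its sub-sample mean, so the estimate reduces to M_0 times the simple average of the sub-sample means, as noted in Remark 1.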
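7.3.5 Two Stage Sampling with PPSWOR in the First Stage 
and SRSWOR in the Second Stage 

\[ \hat{Y}_{PPSWOR} = \sum_{i=1}^{n}\frac{M_i\bar{y}_{i.}}{\pi_{1i}} = \sum_{i=1}^{n}\frac{\hat{Y}_{i.}}{\pi_{1i}}, \]

with variance 

\[ V(\hat{Y}_{PPSWOR}) = \sum_{i=1}^{N}\sum_{j>i}(\pi_{1i}\pi_{1j} - \pi_{1ij})\left(\frac{Y_{i.}}{\pi_{1i}} - \frac{Y_{j.}}{\pi_{1j}}\right)^2 + \sum_{i=1}^{N}\frac{M_i^2(1-f_i)\,S_i^2}{m_i\,\pi_{1i}}, \]

\[ \text{Est. } V(\hat{Y}_{PPSWOR}) = \sum_{i=1}^{n}\sum_{j>i}\frac{\pi_{1i}\pi_{1j} - \pi_{1ij}}{\pi_{1ij}}\left(\frac{\hat{Y}_{i.}}{\pi_{1i}} - \frac{\hat{Y}_{j.}}{\pi_{1j}}\right)^2 + \sum_{i=1}^{n}\frac{M_i^2(1-f_i)\,s_i^2}{m_i\,\pi_{1i}}. \]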


7.3.6 Comparison of Two Stage with Unistage Element 
Sampling 

The sampling designs under consideration are 

i) SRSWOR with nm elements in a single stage. 

ii) Two stage sampling design: select n fsu's in the first stage with SRSWOR 
and select m ssu's from each selected fsu of size M according to 
SRSWOR. 

The variance of the sample mean based on a SRSWOR of size nm selected from NM 
elements is given by 

\[ V(\bar{y}_{nm}) = \Big(\frac{1}{nm} - \frac{1}{NM}\Big)S^2 \approx S^2/nm. \]

Further, the variance of the estimate of the population mean based on nm 
elements selected in two stages with equal first stage units is approximately 
given by 

\[ V(\bar{y})_{\text{two stage}} = \frac{S^2}{nm}\,[1 + \delta(m-1)], \]

where $\delta$ is the coefficient of homogeneity, defined by $\delta = 1 - S_w^2/S^2$. The 
relative efficiency is $RE(\bar{y})_{\text{two stage}} = 1/\{1 + \delta(m-1)\}$. 

7.3.7 Comparison of Two Stage Sampling with Unistage 
Cluster Sampling 

The appropriate unistage cluster sampling for comparison is one in which an 
srs of (nm)/M clusters is selected for the complete enumeration of all 
elements in the clusters. Thus, for such a cluster sampling, 

\[ V(\bar{y}_c) = \Big(\frac{M}{nm} - \frac{1}{N}\Big)S_b^2, \]

whereas 

\[ V(\bar{y})_{\text{two stage}} = \Big(\frac{1}{n} - \frac{1}{N}\Big)S_b^2 + \frac{1}{n}\Big(\frac{1}{m} - \frac{1}{M}\Big)S_w^2 . \]

Now, 

\[ V(\bar{y}_c) - V(\bar{y})_{\text{two stage}} = \frac{1}{n}\Big(S_b^2 - \frac{S_w^2}{M}\Big)\Big(\frac{M}{m} - 1\Big). \]

Thus, two stage sampling will be more efficient than cluster sampling if 
$S_b^2 > S_w^2/M$. For $S_b^2 < S_w^2/M$, the sub-sampling of clusters will result in a 
loss of efficiency. 


7.3.8 Variance and Cost Functions 


As already seen, in two stage sampling the variance of an estimator of the 
population mean or total consists of two components of variation: the 'between 
fsu' variance and the 'within fsu' variance. For instance, in two stage 
sampling with first and second stage sample sizes n and m respectively, the 
variance of the estimate of the population mean is of the form 
$(1/n)(A_1 + A_2/m) + A_3$, where $A_3$ becomes zero if the fsu's are selected with 
replacement. An increase in n reduces both the first and second components, 
whereas an increase in m reduces only the second component. Murthy (1967) has 
shown through empirical studies that the gain in efficiency from increasing m 
is substantial for initial increases in m, but becomes marginal after a 
certain stage. Thus, it is desirable to increase n to bring about a 
substantial reduction in the sampling variance, provided the cost of doing so 
is not prohibitive. 

The optimum values of m and n can be obtained by choosing a suitable 
cost function. Assuming travel costs between fsu's to be unimportant, the cost 
function may be taken as $C = C_1n + C_2nm$. 

The first component $C_1n$ is proportional to the number of primary or first 
stage units and the second component $C_2nm$ is proportional to the total number 
of ssu's. The variance of $\bar{y}$ in two stage sampling may be written as 

\[ V = \Big(\frac{1}{n} - \frac{1}{N}\Big)S_b^2 + \frac{1}{n}\Big(\frac{1}{m} - \frac{1}{M}\Big)S_w^2 = \frac{1}{n}\Big(S_b^2 - \frac{S_w^2}{M}\Big) + \frac{S_w^2}{nm} - \frac{S_b^2}{N}. \]

The last term in V is independent of n and m. Thus, minimizing V for fixed C or 
minimizing C for fixed V is equivalent to minimizing the product 

\[ \Big(V + \frac{S_b^2}{N}\Big)C = \Big[\Big(S_b^2 - \frac{S_w^2}{M}\Big) + \frac{S_w^2}{m}\Big](C_1 + C_2m). \]

By the application of the Cauchy-Schwarz inequality, the product is minimized 
when 

\[ m_{opt} = \frac{S_w}{\sqrt{S_b^2 - S_w^2/M}}\,\sqrt{\frac{C_1}{C_2}}, \]

subject to $S_b^2 > S_w^2/M$. 

Remarks: 1. m„ p , is rounded to nearest integer. 

2. If m 0 pt > M or S b 2 < S w 2 /M , put m = M. 

The optimum value of n is obtained by solving either the cost equation or the
variance equation, whichever is fixed in advance.

When the cost of travel between first stage units (primary units) becomes
substantial, it may be more appropriate to use the cost function

$C = C_1 n + C_1' \sqrt{n} + C_2 nm$, where the travel cost $C_1' \sqrt{n}$ is
assumed to be proportional to $\sqrt{n}$.

Hansen, Hurwitz and Madow (1953) have given a method for determining
the pair (m, n) that minimizes the variance. For more detailed discussion on
estimation of optimum sub-sampling numbers when unequal numbers of ssu's
are selected from the selected fsu's either following SRSWOR or varying
probability schemes, the reader may refer to Cochran (1977), Sukhatme et al.
(1984), Rangarajan (1957) and J.N.K. Rao (1961, 62).
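
The optimum allocation under the simple cost function $C = C_1 n + C_2 nm$ can be sketched numerically as follows. This is only an illustrative sketch: the function name `optimum_two_stage` and the input values are hypothetical, and the rule used for m is the standard Cauchy-Schwarz solution $m_{opt} = \sqrt{C_1 S_w^2 / \{C_2 (S_b^2 - S_w^2/M)\}}$ with the two Remarks applied.

```python
import math

def optimum_two_stage(S_b2, S_w2, M, C1, C2, budget):
    """Return (m, n) for a fixed budget C = C1*n + C2*n*m (hypothetical helper)."""
    between = S_b2 - S_w2 / M
    if between <= 0:
        m = M                                   # Remark 2: observe every ssu
    else:
        m_cont = math.sqrt(C1 * S_w2 / (C2 * between))
        m = min(M, max(1, round(m_cont)))       # Remark 1: round, keep 1 <= m <= M
    n = budget / (C1 + C2 * m)                  # solve the cost equation for n
    return m, n

# Hypothetical inputs, chosen only for illustration
m_opt, n_opt = optimum_two_stage(S_b2=0.1642, S_w2=0.3215, M=16,
                                 C1=4.0, C2=1.0, budget=100.0)
print(m_opt, n_opt)
```

In practice n would also be rounded to an integer, and the variance equation would be solved instead of the cost equation when V is the quantity fixed in advance.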


7.3.9 Unbiased Variance Estimation for Multi-stage 
Designs - A Generalized Method 

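A general linear unbiased estimator of the population total
$Y = \sum_{i=1}^{N} Y_i$ is given by

\[ \hat{Y} = \sum_{i \in s_1} a_{is_1} \hat{Y}_i = \sum_{i=1}^{N} a'_{is_1} \hat{Y}_i, \]

where $\hat{Y}_i$ is an unbiased estimator of $Y_i$ based on the sub-sample from
the ith fsu and the $a_{is_1}$ are real numbers such that, for each $s_1$,

\[ a'_{is_1} = a_{is_1}, \; i \in s_1; \qquad a'_{is_1} = 0, \text{ otherwise}. \]

Now,

\[ E(\hat{Y}) = E_1 E_2 \left( \sum_{i=1}^{N} a'_{is_1} \hat{Y}_i \right)
   = E_1 \left( \sum_{i=1}^{N} a'_{is_1} Y_i \right) = \sum_{i=1}^{N} E_1(a'_{is_1}) Y_i. \]

The condition for unbiasedness is

\[ E_1(a'_{is_1}) = 1, \quad i = 1, \ldots, N. \]

Further,

\[ V(\hat{Y}) = V_1 E_2(\hat{Y}) + E_1 V_2(\hat{Y})
   = V_1 \left( \sum_{i=1}^{N} a'_{is_1} Y_i \right) + E_1 \left( \sum_{i=1}^{N} a'^{2}_{is_1} V_2(\hat{Y}_i) \right) \]
\[ = V(\hat{Y}_c) + \sum_{i=1}^{N} E_1(a'^{2}_{is_1}) V_2(\hat{Y}_i)
   = V(\hat{Y}_c) + \sum_{i=1}^{N} E_1(a'^{2}_{is_1}) V_{is_1}, \]

where $V_{is_1} = V_2(\hat{Y}_i)$ is the conditional variance of $\hat{Y}_i$ for a
given $s_1$ and $\hat{Y}_c = \sum_{i=1}^{N} a'_{is_1} Y_i$, which is the unbiased
estimator of Y for unistage cluster sampling.

Let us assume that an unbiased estimator of $V(\hat{Y}_c)$ exists. Thus, it must
necessarily be of the form

\[ v(\hat{Y}_c) = f(Y) = \sum_{i \in s_1} b_{is_1} Y_i^2 + \sum\sum_{i \ne k \in s_1} b_{iks_1} Y_i Y_k. \]

The real numbers $b_{is_1}$ and $b_{iks_1}$ are predetermined for every $s_1$. We
have, therefore, $E_1[f(Y)] = V(\hat{Y}_c)$ or,

\[ E_1 \left[ \sum_{i \in s_1} b_{is_1} Y_i^2 + \sum\sum_{i \ne k \in s_1} b_{iks_1} Y_i Y_k \right]
   = V \left( \sum_{i=1}^{N} a'_{is_1} Y_i \right) \]

or,

\[ \sum_{i=1}^{N} E_1(b'_{is_1}) Y_i^2 + \sum\sum_{i \ne k} E_1(b'_{iks_1}) Y_i Y_k
   = \sum_{i=1}^{N} V(a'_{is_1}) Y_i^2 + \sum\sum_{i \ne k} \mathrm{cov}(a'_{is_1}, a'_{ks_1}) Y_i Y_k, \]

where (i) $b'_{is_1} = b_{is_1}$ if $i \in s_1$ and = 0 otherwise, (ii)
$b'_{iks_1} = b_{iks_1}$ if $(i, k) \in s_1$ and = 0 otherwise. The conditions for
unbiasedness are

\[ E_1(b'_{is_1}) = V(a'_{is_1}), \quad E_1(b'_{iks_1}) = \mathrm{cov}(a'_{is_1}, a'_{ks_1}), \quad i \ne k = 1, \ldots, N. \]

Suppose for a given $s_1$, an unbiased estimate $\hat{V}_{is_1}$ of $V_{is_1}$ is
available. Then

\[ E_2[\hat{V}_{is_1}] = V_{is_1}, \quad i \in s_1. \]

If $V_{is_1} = V_i$, where $V_i$ is constant for each $s_1$, an unbiased estimate
of $V(\hat{Y})$ is given by

\[ v(\hat{Y}) = f(\hat{Y}) + \sum_{i \in s_1} a_{is_1} \hat{V}_{is_1}, \quad \text{where } V_2(\hat{Y}_i) = V_i. \]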

This formula is due to Raj (1966, 68) and leads to an important rule: 'Get
an unbiased estimator of the variance in unistage cluster sampling. Obtain a
copy of it by substituting $\hat{Y}_i$ in place of $Y_i$. Also, get a copy of the
estimator of Y in unistage cluster sampling by substituting $\hat{V}_i$ for
$Y_i$. The sum of these two copies is an unbiased estimator of variance in the
multistage case' (J.N.K. Rao, 1975). Durbin (1953) has given a similar rule for
finding the estimate of variance of the Horvitz-Thompson estimator of the
population total in multistage designs.

The estimate of variance in multistage sampling is the sum of two parts. 
The first part is equal to the estimate of variance calculated on the assumption 
that the first stage values have been measured without error. The second part is 
calculated as if the first stage units selected were fixed strata, the contribution 
from each first stage unit being multiplied by the probability of that unit’s 
inclusion in the sample. 
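
As an illustration of the 'two copies' rule, consider the special case (an assumption made here, not the general design of this section) in which n fsu's are drawn by SRSWOR from N and $m_i$ ssu's by SRSWOR from the $M_i$ ssu's of each selected fsu, so that $a_{is_1} = N/n$. The sketch below uses a hypothetical function name and toy data; the first copy is the unistage variance estimator evaluated at the estimated fsu totals and the second copy is the estimator of Y evaluated at the estimated within-fsu variances.

```python
from statistics import mean, variance

def two_stage_total_and_variance(N, M, samples):
    """samples[i]: observed ssu values in selected fsu i; M[i]: its size M_i."""
    n = len(samples)
    # Estimated fsu totals: \hat Y_i = M_i * (sample mean of fsu i)
    t_hat = [Mi * mean(y) for Mi, y in zip(M, samples)]
    Y_hat = N / n * sum(t_hat)
    # Copy 1: unistage SRSWOR variance estimator with \hat Y_i in place of Y_i
    part1 = N**2 * (1 / n - 1 / N) * variance(t_hat)
    # \hat V_i: unbiased estimate of V_2(\hat Y_i) under SRSWOR within fsu i
    v_hat = [Mi**2 * (1 / len(y) - 1 / Mi) * variance(y)
             for Mi, y in zip(M, samples)]
    # Copy 2: estimator of Y with \hat V_i in place of Y_i
    part2 = N / n * sum(v_hat)
    return Y_hat, part1 + part2

# Toy data (hypothetical): N = 5 fsu's, two of them sampled
Y_hat, v_Y_hat = two_stage_total_and_variance(
    N=5, M=[4, 6], samples=[[1, 3], [2, 4, 6]])
print(Y_hat, v_Y_hat)
```

Each selected fsu needs at least two observed ssu's for the within-fsu variance to be computable.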

Although Raj's formula is valid for many multistage designs, Rao (1975)
has given some examples where $V_{is_1} = V_i$ is violated, such as when n fsu's
are selected by pps sampling with replacement and, if the ith fsu is selected
$\lambda_i$ times ($\lambda_i \ge 0$, $\sum \lambda_i = n$), a sample of
$\lambda_i m_i$ ssu's is drawn by SRSWOR from it, assuming $n m_i \le M_i$ for all
i, where $M_i$ = number of ssu's in the ith fsu (Sukhatme and

Sukhatme, 1970). Under such circumstances an unbiased estimator of $V(\hat{Y})$
as suggested by Rao (1975) is

\[ v(\hat{Y}) = f(\hat{Y}) + \sum_{i \in s_1} \left( a_{is_1}^2 - b_{is_1} \right) \hat{V}_{is_1}. \]

Srinath and Hidiroglou (1980) have suggested a novel procedure: draw a
WOR sub-sample of $m_i'$ from the $m_i$ sampled ssu's and form an estimate of
variance as

\[ \sum\sum_{i < j} \frac{\pi_i \pi_j - \pi_{ij}}{\pi_{ij}} \left[ \frac{T_i'}{\pi_i} - \frac{T_j'}{\pi_j} \right]^2, \]

where $T_i' = M_i \bar{y}_i'$ and $\bar{y}_i'$ is the sample mean based on $m_i'$
ssu's. The value of $m_i'$ which achieves unbiasedness is

\[ m_i' = \frac{m_i (1 - \pi_i)}{1 - \pi_i (m_i / M_i)}. \]

This is an interesting piece of work in the sense that, while it is not
possible to obtain an unbiased estimator of the total variance of this form
based on the estimated fsu totals from all sampled ssu's, it is possible to
obtain one using fsu totals based on a sub-sample of the sampled ssu's.



NUMERICAL EXAMPLES 

Example 7.1: A pilot sample survey for the study of cultivation practices and
yield of guava was conducted by the Indian Council of Agricultural Research in
Allahabad district of Uttar Pradesh (India). Out of a total of 400 bearing trees
in Umerpur-Neera Village, 10 clusters of size 4 trees were selected and their
yield records (kilogram) are given below:


Table 7E.1


Cluster No.   1st tree   2nd tree   3rd tree   4th tree

     1          5.53       4.84       0.69      15.79
     2         26.11      10.93      19.08      11.18
     3         11.08       0.65       4.21       7.56
     4         12.66      32.52      16.92      37.02
     5          0.87       3.56       4.81      27.54
     6          6.40      11.68      40.05       5.15
     7         54.21      34.63      52.55      37.96
     8          1.94      35.97      29.54      25.98
     9         37.04      47.07      16.94      28.11
    10         56.92      17.69      26.24       6.77


i)  Estimate the average yield (kilogram) per tree of guava in the
    Umerpur-Meera Village of Allahabad, U.P. along with its standard error.

ii) Estimate the efficiency of cluster sampling compared to simple random
    sampling and the coefficient of homogeneity.


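Solution: Here N = 400/4 = 100 clusters, n = 10 and M = 4. An estimate of
average yield per tree of guava is given by

\[ \hat{\bar{Y}}_c = \frac{1}{nM} \sum_{i=1}^{n} \sum_{j=1}^{M} y_{ij} = \frac{1}{n} \sum_{i=1}^{n} \bar{y}_i = 20.44 \text{ kg}. \]

\[ \text{Est. } V(\hat{\bar{Y}}_c) = \left( \frac{1}{n} - \frac{1}{N} \right) \frac{1}{n-1} \sum_{i=1}^{n} (\bar{y}_i - \hat{\bar{Y}}_c)^2
   = \left( \frac{1}{10} - \frac{1}{100} \right) \frac{(1386.3071)}{9} = 13.8631 \text{ (kg)}^2 \]

Est. S.E. $(\hat{\bar{Y}}_c)$ = 3.72 kg.

Now,

\[ s_b^2 = \frac{1}{n-1} \sum_{i=1}^{n} (\bar{y}_i - \hat{\bar{Y}}_c)^2 = 154.034 \]

and

\[ s_w^2 = \frac{1}{n(M-1)} \sum_{i=1}^{n} \sum_{j=1}^{M} (y_{ij} - \bar{y}_i)^2 = 162.923 \]

An estimate of efficiency of cluster sampling as compared with simple random
sampling is

\[ \frac{(N-1) M s_b^2 + N(M-1) s_w^2}{(NM-1) M s_b^2}
   = \{396(154.034) + 300(162.923)\} / \{(1596)(154.034)\} = 0.447 \]

An estimate of coefficient of homogeneity

\[ \delta = 1 - \frac{s_w^2}{s^2} = 1 - \frac{162.923}{275.374} = 0.408, \]

where $s^2 = \{(N-1) M s_b^2 + N(M-1) s_w^2\}/(NM-1) = 275.374$.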
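
The arithmetic of Example 7.1 can be retraced from the two mean squares quoted in the solution. The short sketch below takes $s_b^2 = 154.034$ and $s_w^2 = 162.923$ as given (it does not recompute them from Table 7E.1) and uses variable names chosen only for illustration.

```python
import math

N, n, M = 100, 10, 4            # clusters in population, clusters sampled, cluster size
s_b2, s_w2 = 154.034, 162.923   # mean squares quoted in the solution

var_mean = (1 / n - 1 / N) * s_b2                     # Est. V of the mean per tree
se_mean = math.sqrt(var_mean)                         # Est. S.E.
s2 = ((N - 1) * M * s_b2 + N * (M - 1) * s_w2) / (N * M - 1)
efficiency = s2 / (M * s_b2)                          # cluster sampling vs SRS
delta = 1 - s_w2 / s2                                 # coefficient of homogeneity

print(round(var_mean, 4), round(se_mean, 2), round(efficiency, 3), round(delta, 3))
```

An efficiency below one indicates that, for the same number of trees, cluster sampling is less precise here than simple random sampling of trees.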


Example 7.2: In an experimental investigation 100 fields, each consisting of 16
plots of equal size, were sown with wheat. Out of 100 fields, 10 fields were
selected by SRSWOR and out of each field so selected 4 plots were selected by
SRSWOR to observe the yield (kg).

From the given observations the following values are calculated: Sample
mean yields (in kg) for the selected fields are: 4.290, 4.255, 3.795, 4.220,
4.070, 3.636, 4.550, 4.285, 4.375, 3.790.

$s_b^2$ = 0.1642 (kg)$^2$, $s_w^2$ = 0.3215 (kg)$^2$

Estimate the total yield of wheat in the experimental station along with its
standard error.

Compare the efficiency of this estimate with the one that would have been
obtained by selecting a SRSWOR of 40 plots out of 1600 plots in the station.
Solution

\[ \text{Est. } V(\hat{\bar{Y}}) = \left( \frac{1}{n} - \frac{1}{N} \right) s_b^2 + \frac{1}{N} \left( \frac{1}{m} - \frac{1}{M} \right) s_w^2 \]

N = 100, M = 16, n = 10 and m = 4.

An unbiased estimate of the average wheat yield is computed as

\[ \hat{\bar{Y}} = \frac{1}{n} \sum_{i=1}^{n} \bar{y}_i = (41.265)/10 = 4.1265 \text{ kg}. \]

Estimate of total yield = $\hat{Y} = NM \hat{\bar{Y}}$ = 1600 x 4.1265 = 6602.4 kg.

Est. $V(\hat{\bar{Y}})$
= (1/10 - 1/100)(0.1642) + 1/100 (1/4 - 1/16)(0.3215)
= 0.014778 + 0.0006028 = 0.0154 (kg)$^2$

Est. S.E. $(\hat{\bar{Y}})$ = 0.1240 kg.

Hence,

Est. S.E. $(\hat{Y})$ = 1600 x 0.1240 = 198.40 kg

In simple random sampling, an estimate of variance is given by

\[ \text{Est. } V(\hat{\bar{Y}}_{srs}) = \left( \frac{1}{nm} - \frac{1}{NM} \right) \hat{S}^2. \]

The estimate of $S^2$ from a two stage sampling design can be obtained as
follows. As

\[ S^2 = \frac{(N-1) M S_b^2 + N(M-1) S_w^2}{NM - 1}, \quad
   \hat{S}_b^2 = s_b^2 - \frac{1 - m/M}{m} s_w^2 \quad \text{and} \quad \hat{S}_w^2 = s_w^2, \]

we have,

\[ \hat{S}^2 = \frac{1}{NM-1} \left[ (N-1) M s_b^2 + \left\{ N(M-1) - \frac{(N-1)(M-m)}{m} \right\} s_w^2 \right] \]
\[ = \frac{1}{(1600-1)} \left[ 99 \times 16 \times 0.1642 + \left\{ 100 \times 15 - \frac{12 \times 99}{4} \right\} \times 0.3215 \right] = 0.4045 \]

Therefore, Est. $V(\hat{\bar{Y}})_{srs}$ = (1/40 - 1/1600)(0.4045) = 0.0099 (kg)$^2$

Relative efficiency = 0.0099/0.0154 = 0.6429.
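
The steps of Example 7.2 can be retraced with a few lines of code. The sketch below takes the listed field means and the quoted values of $s_b^2$ and $s_w^2$ as given; the variable names are illustrative only. Because it carries unrounded intermediate values, its relative efficiency (about 0.641) differs slightly from the 0.6429 obtained above from the rounded figures 0.0099 and 0.0154.

```python
import math

N, M, n, m = 100, 16, 10, 4
field_means = [4.290, 4.255, 3.795, 4.220, 4.070,
               3.636, 4.550, 4.285, 4.375, 3.790]
s_b2, s_w2 = 0.1642, 0.3215          # values quoted in the example

y_bar = sum(field_means) / n         # estimated mean yield per plot
total = N * M * y_bar                # estimated total yield

# Two stage variance estimate of the mean and S.E. of the total
v_mean = (1 / n - 1 / N) * s_b2 + (1 / N) * (1 / m - 1 / M) * s_w2
se_total = N * M * math.sqrt(v_mean)

# Estimate of S^2 from the two stage sample, then the SRSWOR variance for nm plots
S2_hat = ((N - 1) * M * s_b2
          + (N * (M - 1) - (N - 1) * (M - m) / m) * s_w2) / (N * M - 1)
v_srs = (1 / (n * m) - 1 / (N * M)) * S2_hat
rel_eff = v_srs / v_mean

print(round(y_bar, 4), round(total, 1), round(se_total, 1), round(rel_eff, 3))
```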

Example 7.3: For studying the feeding and rearing practices of sheep and yield
of wool in Ajmer division of Rajasthan State of India during 1980-81, a two
stage sampling design was adopted with tehsils as first stage units and villages
in the tehsil as second stage units. Table 7E.2 gives stationary sheep population
in the selected villages in each of the four tehsils selected from 12 tehsils of
Ajmer division (Singh and Chaudhary, 1986).

Estimate the mean stationary sheep population in the Ajmer division
during the year 1960-61 together with its standard error, when $\bar{M}$ = 124,
by (i) biased method of estimation (ii) unbiased method of estimation and
(iii) ratio method of estimation.


Table 7E.2 Sheep population in selected villages.


Selected   No. of     Stationary sheep population in the
Tehsil     villages   selected villages

Behrar       102      266, 890, 311, 46, 174, 31, 17, 186, 224,
                      31, 102, 46, 31, 109, 275, 128, 125, 267,
                      153, 152, 84, 21, 52, 10, 0, 48, 94, 123,
                      87, 89, 109, 0, 310, 3.

Bairath      105      129, 57, 64, 11, 163, 77, 278, 50, 26, 127,
                      25, 194, 350, 0, 572, 149, 275, 114, 387,
                      53, 34, 150, 224, 185, 157, 244, 466,
                      203, 354, 816, 242, 140, 66, 590, 747, 147.

Ajmer        200      247, 622, 225, 278, 181, 984, 132, 659,
                      403, 281, 236, 595, 265, 431, 190, 348,
                      232, 88, 1165, 831, 120, 887, 938, 197,
                      614, 187, 869, 330, 485, 60, 60, 1051,
                      651, 552, 968.

Bansur        88      347, 362, 34, 11, 133, 36, 34, 61, 249, 170,
                      112, 42, 161, 75, 68, 0, 247, 186, 473, 0, 143,
                      198, 65, 0, 308, 122, 345, 0, 223, 302, 219,
                      120, 199, 35, 0, 0.


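Solution


n = 4, $M_1$ = 102, $M_2$ = 105, $M_3$ = 200, $M_4$ = 88, $\bar{M}$ = 124
$m_1$ = 34, $m_2$ = 36, $m_3$ = 35, $m_4$ = 36
$\bar{y}_1$ = 135, $\bar{y}_2$ = 225, $\bar{y}_3$ = 471, $\bar{y}_4$ = 141

i) Biased estimate

\[ \hat{\bar{Y}}_1 = \frac{1}{n} \sum_{i=1}^{n} \bar{y}_i = 972/4 = 243 \]

Est. $V(\hat{\bar{Y}}_1)$ = 4214,

Est. S.E. $(\hat{\bar{Y}}_1) = \sqrt{4214}$ = 65.

ii) Unbiased estimate

\[ \hat{\bar{Y}}_2 = \frac{1}{n \bar{M}} \sum_{i=1}^{n} M_i \bar{y}_i \]
= (102 x 135 + 105 x 225 + 200 x 471 + 88 x 141)/(4 x 124)
= 144003/496 = 290

Est. $V(\hat{\bar{Y}}_2)$ = 9213.

Est. S.E. $(\hat{\bar{Y}}_2) = \sqrt{9213}$ = 96

iii) Ratio estimate

\[ \hat{\bar{Y}}_3 = \frac{\sum_{i=1}^{n} M_i \bar{y}_i}{\sum_{i=1}^{n} M_i} = \frac{144003}{495} = 291 \]

Est. $V(\hat{\bar{Y}}_3)$ = 6555

Est. S.E. $(\hat{\bar{Y}}_3) = \sqrt{6555}$ = 81.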
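
The three point estimates of Example 7.3 differ only in how the tehsil means are weighted, which the following sketch makes explicit. It uses the tehsil sizes and sample means quoted in the solution; the variable names are illustrative, and the variance estimates (4214, 9213, 6555) are not recomputed here.

```python
M_sizes = [102, 105, 200, 88]     # villages in the selected tehsils
y_bars = [135, 225, 471, 141]     # sample means per village in each tehsil
M_bar = 124                       # average number of villages per tehsil
n = len(y_bars)

weighted_sum = sum(Mi * yi for Mi, yi in zip(M_sizes, y_bars))

biased = sum(y_bars) / n                  # (i) simple mean of tehsil means
unbiased = weighted_sum / (n * M_bar)     # (ii) weights M_i / M_bar
ratio = weighted_sum / sum(M_sizes)       # (iii) weights M_i / sum of sampled M_i

print(biased, round(unbiased), round(ratio))
```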


EXERCISES 

7.1 In a two stage sampling one ssu is selected with pp to x from the entire
population. If this happens to come from the ith fsu, a without replacement
random sample of $m_i - 1$ ssu's is taken from the remaining $M_i - 1$ ssu's.
From the other (N - 1) fsu's a without replacement random sample of (n -
1) fsu's is taken. The selected fsu's are sub-sampled following SRSWOR.
Show that $\sum M_i \bar{y}_i / \sum M_i \bar{x}_i$ is an unbiased estimate of
R = Y/X.

7.2 A population consists of N fsu's and each fsu consists of M ssu's. An
SRSWOR of n fsu's and, in each selected fsu, an SRSWOR of m ssu's are
selected to estimate the proportion of a certain attribute in the
population. If $p_i$ is the sample proportion of the attribute in the ith
fsu, show that

\[ p = \frac{1}{n} \sum_{i=1}^{n} p_i \]

is an unbiased estimator of the population proportion P. Obtain the variance
of the estimator and its estimate.

7.3 An SRSWOR of n clusters is selected from a population of N clusters
each of size M. Let $\lambda_i$ be the proportion of units in the ith cluster
possessing a certain attribute, i = 1, ..., N. Let $\lambda$ be the
corresponding proportion for the whole population.

Prove that $\hat{\lambda} = \frac{1}{n} \sum_{i=1}^{n} \lambda_i$

is an unbiased estimate of $\lambda$ with

\[ V(\hat{\lambda}) = \frac{N-n}{N-1} \cdot \frac{\lambda(1-\lambda)}{n}
   \left[ 1 - \frac{\sum_{i=1}^{N} \lambda_i (1 - \lambda_i)}{N \lambda (1 - \lambda)} \right]. \]

7.4 A finite population is composed of N clusters, the ith cluster having
$M_i$ elements. A simple random sample of m elements is drawn from the
$\sum_{i=1}^{N} M_i = M_0$ elements and then grouped into the clusters to which
they belong. Suppose the m elements fall into n clusters. Show that
$\hat{\bar{Y}} = \frac{1}{m} \sum_{i=1}^{n} y_i$, where $y_i$ is the sample total
of the ith cluster, is an unbiased estimator of $\bar{Y}$ and derive its
sampling variance. (Ghosh, 1963)

7.5 In two stage sampling, n fsu's are selected with ppswr. Consider the
following sampling procedures in the second stage if the ith fsu occurs
$r_i$ times in the sample

i) $r_i m_i$ ssu's are selected with SRSWOR

ii) $r_i$ independent sub-samples of $m_i$ ssu's are selected.

iii) $m_i$ units (ssu) are selected without replacement and observations
are weighted by $r_i$.

Obtain unbiased estimators of the population total for all three procedures.
If $V_1$, $V_2$ and $V_3$ are their sampling variances, show that for the same
expected sample size $V_1 < V_2 < V_3$. (Rao, 1961).

7.6 A population consists of N clusters, the ith cluster having $M_i$
elements, and $\bar{M} = \frac{1}{N} \sum_{i=1}^{N} M_i$. n clusters are selected
from the N clusters using SRSWOR. Show that
$\hat{\bar{Y}} = \frac{1}{n} \sum_{i=1}^{n} \bar{y}_i$ is a biased estimator of
$\bar{Y}$, where $\bar{y}_i$ is the mean of the ith cluster in the sample.
Obtain an unbiased estimator of the bias and hence obtain an unbiased
estimator of $\bar{Y}$.

7.7 Suppose a population consists of N clusters having M units each.
Selecting n clusters systematically, derive the sampling variance of
the estimator of population mean per unit. (Madow, 1949).

7.8 Suppose NM units of a population are grouped at random to form N
clusters having M units each. Show that an SRSWOR of n clusters
would have the same efficiency as that of SRSWOR of nM units.

7.9 For estimating the total yield of paddy (Y) in a district a stratified two
stage sampling design with three strata was adopted. From each
stratum four villages were selected with PPSWR, size being the
geographical area, and 4 plots were drawn following SRSWOR from
each sampled village to observe the yield of paddy. Using the data in
Table 7.1 estimate unbiasedly Y and obtain an estimate of its standard
error.


Table 7.1. Yield of paddy for the sample plots.


Stratum   Sample    Inverse of    Total     Yield of paddy (kg)
          village   probability   No. of    for plot
                                  plots       1      2      3      4

   1         1        435.24        25       102    162    150     92
             2        628.43        10        98     72    134    142
             3         35.58        25       112    115     45    165
             4         98.25        70       342    315    140    115

   2         1         25.00       270       125    112    127    215
             2         18.80       300       118    160    105    128
             3         22.75       250       252     65    135     45
             4         45.25        75       315    115     58    105

   3         1         67.16       125       125    254    110    112
             2        298.25        34        85     58    115    118
             3         89.00       125       115    200    169    105
             4         72.15       170       235    115    290    124


7.10 By SRSWOR, n first stage units are selected from N first stage units. In
the ith first stage unit having $M_i$ second stage units, a SRSWOR sample of
$m_i$ units is selected. Information is collected on y and x for each
selected unit.

Define $\hat{X} = \frac{N}{n} \sum_{i=1}^{n} M_i \bar{x}_i$ and
$\hat{Y} = \frac{N}{n} \sum_{i=1}^{n} M_i \bar{y}_i$.

Find cov $(\hat{X}, \hat{Y})$.

7.11 Consider a population of N clusters with known measures of size $X_i$
(i = 1, 2, ..., N). Find the expected value of the variance of the
estimator $t = \sum y_i / \pi_i$, under the super population model:

\[ E_m(Y_i \mid X_i) = a X_i; \quad V_m(Y_i \mid X_i) = \sigma^2 X_i^g, \]
\[ \mathrm{cov}_m(Y_i, Y_j \mid X_i, X_j) = 0 \quad (i \ne j). \]

Find the condition that sampling with $\pi_i \propto X_i^2$ is better than
sampling with $\pi_i \propto X_i$.


8. USE OF AUXILIARY INFORMATION 
(METHODS OF ESTIMATION) 


8.1 Introduction 

In finite population sampling the foremost interest of the sampler is to provide
a reasonably good estimator of the population parametric function in question.
Towards this end, he not only uses a suitable sampling design at the
selection stage, but also takes into account all available auxiliary information 
pertaining to the study variable during the collection of data to improve the 
quality of the estimate. In case auxiliary information or in other words 
information on auxiliary variables is available while planning the survey, this 
may be fruitfully utilised both at the design and estimation stage. As the name 
suggests auxiliary information, in general, should have some influence on the 
survey or study variable to bring about a possible reduction in the sampling 
error of the estimate. If y and x are the study variable and related auxiliary
variable respectively, one may postulate a functional form y = f(x) between
them and determine it approximately from a given probability sample(s) and
finally use it at the estimation stage to change the form of the estimator in order 
to improve its efficiency. The underlying assumption behind the use of 
auxiliary variables is that they covary with the survey variable and thus carry 
some information about it. For instance, to estimate total area under wheat in a 
region, where sampling units are villages, total cultivated area of a village may 
be taken as an auxiliary variable. Similarly, for estimating cattle population in a 
region, previous census figure on cattle population for each village may provide 
suitable auxiliary information. If we are interested in estimating total number of
guava trees in guava orchards, it may be worthwhile to record the area under 
guava orchards in the sample along with the number of guava trees for sample 
orchards. In repeated surveys information collected on the study variable in the 
previous period may be treated as auxiliary information for the study variable in 
the current period. Watson (1937) used leaf weight (x) as auxiliary
information and used the regression of leaf area (y) on leaf weight to estimate
the average area of the leaves on a plant. Cochran (1977) suggested that a rat
expert might make a quick eye estimate (x) of the number of rats in each block
in a city area and then determine the number of rats (y) in each block in the
sample by actual trapping to arrive at an improved estimator of the number of
rats in the city. Yates (1960) has suggested that an eye estimate of the volume
of timber (x) may be fruitfully exploited as an auxiliary variable to estimate
the volume of timber in a forest area.

In certain situations when information on a number of auxiliary variables 
is available, a survey statistician should take a judicious decision as to its 
efficient use whether at the selection stage or at the estimation or at both the 
stages. To arrive at such a decision one needs to make certain assumptions as to 
the variables and their interrelations on the basis of suitable models which may 
be approximated by empirical studies. 


8.2. Ratio Method of Estimation 


8.2.1 Introduction 


Let there be N units in the population listed as $U_1, U_2, \ldots, U_N$. To each
unit $U_i$ (i = 1, ..., N) a pair of values $(Y_i, X_i)$, assumed by real
variables (y, x), is associated. Assume that $X_i > 0$ for all i and further
that y and x are positively correlated. $Y_i$ and $X_i$ correspond to
measurements on the ith unit pertaining to y and x, called the study
characteristic (variable) and the auxiliary characteristic (variable)
respectively. Define $(\bar{Y}, \bar{X})$ and (Y, X) as the population means and
totals of (y, x) respectively. Further, define the population ratio
$R = \bar{Y}/\bar{X}$ or Y/X.

To estimate R, select a sample (s) of size n (fixed size) from N units in the
population following any probability sampling design p(s). Observe
$(y_i, x_i)$, i = 1, ..., n, on the sample units $u_1, u_2, \ldots, u_n$. It is
natural to estimate R by

\[ \hat{R} = \hat{\bar{Y}} / \hat{\bar{X}} \quad \text{or} \quad \hat{R} = \hat{Y} / \hat{X}, \]

where $(\hat{\bar{Y}}, \hat{\bar{X}})$ and $(\hat{Y}, \hat{X})$ represent
estimates of population means and totals of y and x respectively.

In case of simple random sampling design the estimate of R with a sample
of fixed size n is given by
$\hat{R} = \hat{\bar{Y}} / \hat{\bar{X}} = \bar{y}/\bar{x} = r$ (say), the ratio of
the sample mean of y to that of x (or the ratio of the sample total of y to
that of x).

Another estimator of the population ratio R may be computed with the help of
the mean of ratios in the form

\[ \bar{r} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_i}{x_i} \right) = \frac{1}{n} \sum_{i=1}^{n} r_i \text{ (say)}. \]

For any sampling design the ratio estimates of the population mean and
population total can be computed as follows:

\[ \hat{\bar{Y}}_r = (\hat{\bar{Y}} / \hat{\bar{X}}) \bar{X}; \quad \hat{Y}_r = (\hat{Y} / \hat{X}) X. \]

A notable disadvantage of using the ratio method of estimation to estimate the
population mean/total of y is that it presupposes advance knowledge of the
population mean $\bar{X}$ or population total X, which may not always be readily
available. If $Y_i$ and $X_i$ are approximately proportional,
$\hat{\bar{Y}} / \hat{\bar{X}}$ would vary little from one sample to another and
hence would have high precision.

In the following sections we give the treatment of ratio estimators under
SRSWR or SRSWOR design unless otherwise stated.
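
A minimal sketch of these estimators for a simple random sample is given below. The function name, the sample values and the assumed population figures ($\bar{X}$ = 26, N = 50) are hypothetical and serve only to show how r, $\bar{r}$ and the ratio estimates of the mean and total are formed.

```python
def ratio_estimates(y, x, X_bar, N):
    """Return (r, r_bar, ratio estimate of mean, ratio estimate of total)."""
    n = len(y)
    y_bar, x_bar = sum(y) / n, sum(x) / n
    r = y_bar / x_bar                                    # ratio of sample means
    r_bar = sum(yi / xi for yi, xi in zip(y, x)) / n     # mean of ratios
    mean_r = r * X_bar                                   # uses known population mean of x
    total_r = r * N * X_bar                              # uses known population total of x
    return r, r_bar, mean_r, total_r

# Hypothetical sample of n = 4 units
r, r_bar, mean_r, total_r = ratio_estimates(
    y=[12, 20, 33, 41], x=[10, 20, 30, 40], X_bar=26, N=50)
print(r, r_bar, mean_r, total_r)
```

Note that r and $\bar{r}$ coincide only when every unit in the sample has the same ratio $y_i/x_i$.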

8.2.2 Expected Value and Mean Square Error of the 
Ratio Estimator 


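a) Simple Random Sampling: Under SRSWOR, the expected value of r is

\[ E(r) = E(\bar{y}/\bar{x}) = \binom{N}{n}^{-1} \sum_{s \in S} \frac{\sum_{i \in s} Y_i}{\sum_{i \in s} X_i}
   = \sum_{i=1}^{N} T_i^{(1)} Y_i, \]

showing thereby that r is a biased estimator of R.

\[ MSE(r) = E(r - R)^2 = \sum_{i=1}^{N} T_i^{(2)} Y_i^2 + \sum\sum_{i \ne j} T_{ij}^{(2)} Y_i Y_j
   - 2R \sum_{i=1}^{N} T_i^{(1)} Y_i + R^2, \]

where

\[ T_i^{(1)} = \binom{N}{n}^{-1} \sum_{s \ni i} \frac{1}{x_s}, \quad
   T_i^{(2)} = \binom{N}{n}^{-1} \sum_{s \ni i} \frac{1}{x_s^2}, \quad
   T_{ij}^{(2)} = \binom{N}{n}^{-1} \sum_{s \ni i, j} \frac{1}{x_s^2}, \]

$x_s = \sum_{k \in s} X_k$ denotes the sample total of x and S is the set of all
$\binom{N}{n}$ possible samples.

These expressions, which are computationally not advantageous, are due to T.J.
Rao (1967).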

An alternative and more useful form of the expected value, leading to a
neat expression for the bias, is given by

Bias (r) = $E(r) - R = E(\bar{y}/\bar{x}) - E(\bar{y})/E(\bar{x})$

= $-(1/\bar{X})$ cov $(\bar{y}/\bar{x}, \bar{x})$.


Further,

Bias $(\bar{r}) = E(\bar{r}) - R$

= $E[(1/n) \sum (y_i/x_i)] - E(\bar{y})/E(\bar{x})$

= $E(y_i/x_i) - E(y_i)/E(x_i) = (-1/\bar{X})$ cov $(y_i/x_i, x_i)$

Thus, both r and $\bar{r}$ are in general biased unless the relationship between
y and x is a straight line through the origin, when the bias is zero.

Using the Cauchy-Schwarz inequality,

\[ \bar{X} \, |\text{Bias}(r)| = |\mathrm{cov}(r, \bar{x})| \le \sqrt{V(r) V(\bar{x})} \]

and

\[ \bar{X} \, |\text{Bias}(\bar{r})| = |\mathrm{cov}(y_i/x_i, x_i)| \le \sqrt{V(y_i/x_i) V(x_i)} \]

As $n \to \infty$, V(r) and $V(\bar{x})$ are of order O(1/n) and so Bias (r)
becomes very small, so as to become negligible. But this is not so in case of
Bias $(\bar{r})$, which in fact is constant and independent of n. Further,

\[ \frac{|\text{Bias}(r)|}{\text{S.E.}(r)} \le \sqrt{\frac{V(\bar{x})}{\bar{X}^2}} = \sqrt{1 - n/N} \; C_x / \sqrt{n}, \]

where $C_x$ stands for the coefficient of variation of x. Hence, the bias may be
assumed to be negligible if the C.V. of $\bar{x}$ remains sufficiently small,
say, less than 10%.
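
For a very small population the exact bias of r can be obtained by listing every SRSWOR sample, which gives a direct check of the bound $|\text{Bias}(r)| / \text{S.E.}(r) \le$ C.V.$(\bar{x})$. The sketch below is illustrative only: the function name and the five-unit population are hypothetical.

```python
from itertools import combinations
from math import sqrt

def exact_ratio_bias(Y, X, n):
    """Enumerate all SRSWOR samples of size n; return (bias, S.E., C.V. of x-bar) for r."""
    N = len(Y)
    R = sum(Y) / sum(X)
    X_bar = sum(X) / N
    samples = list(combinations(range(N), n))
    r_vals = [sum(Y[i] for i in s) / sum(X[i] for i in s) for s in samples]
    xbar_vals = [sum(X[i] for i in s) / n for s in samples]
    k = len(samples)
    E_r = sum(r_vals) / k
    se_r = sqrt(sum((r - E_r) ** 2 for r in r_vals) / k)
    sd_xbar = sqrt(sum((xb - X_bar) ** 2 for xb in xbar_vals) / k)
    return E_r - R, se_r, sd_xbar / X_bar

# Hypothetical population of N = 5 units
Y = [3, 5, 9, 12, 20]
X = [2, 4, 6, 8, 10]
bias, se_r, cv_xbar = exact_ratio_bias(Y, X, n=2)
print(bias, se_r, cv_xbar, abs(bias) / se_r <= cv_xbar)
```

When Y is exactly proportional to X, every sample gives r = R and the enumerated bias is zero, in line with the remark on a straight line through the origin.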


Approximation to Bias and Mean Square Error (SRSWOR) 


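Write $r = \bar{y}/\bar{x} = \frac{\bar{Y}}{\bar{X}} (1 + \epsilon/\bar{Y})(1 + \delta/\bar{X})^{-1}$,

where $\epsilon = \bar{y} - \bar{Y}$ and $\delta = \bar{x} - \bar{X}$.

Now r can be expanded in a power series of $\delta/\bar{X}$ if
$|\delta/\bar{X}| < 1$ for all the $\binom{N}{n}$ samples. Thus, under this
assumption,

\[ r - R = R \left[ \left( \frac{\epsilon}{\bar{Y}} - \frac{\delta}{\bar{X}} \right)
   - \frac{\delta}{\bar{X}} \left( \frac{\epsilon}{\bar{Y}} - \frac{\delta}{\bar{X}} \right)
   + \frac{\delta^2}{\bar{X}^2} \left( \frac{\epsilon}{\bar{Y}} - \frac{\delta}{\bar{X}} \right)
   - \frac{\delta^3}{\bar{X}^3} \left( \frac{\epsilon}{\bar{Y}} - \frac{\delta}{\bar{X}} \right) + \cdots \right] \]

Taking expectation of both the sides and to first order approximation,

\[ E(r - R) = R \left[ E(\delta/\bar{X})^2 - E(\epsilon/\bar{Y} \cdot \delta/\bar{X}) \right], \]

retaining terms upto O(1/n), as (David and Sukhatme, 1974)

\[ E(\delta^s \epsilon^t) = \begin{cases} O(n^{-(s+t)/2}), & \text{if } s + t \text{ is even} \\
   O(n^{-(s+t+1)/2}), & \text{if } s + t \text{ is odd.} \end{cases} \]

Thus,

\[ E(r - R) = R \left[ \frac{N-n}{(N-1) n} \right] (C_{20} - C_{11})
   = R \left( \frac{1}{n} - \frac{1}{N} \right) (C_x^2 - \rho C_y C_x) \]
\[ = \left( \frac{1}{n} - \frac{1}{N} \right) \frac{1}{\bar{X}^2} \cdot \frac{1}{N-1} \sum_{i=1}^{N} X_i (R X_i - Y_i), \]

where

\[ C_{st} = \frac{(1/N) \sum_{i=1}^{N} (X_i - \bar{X})^s (Y_i - \bar{Y})^t}{\bar{X}^s \bar{Y}^t}, \quad
   C_x = S_x / \bar{X}, \quad C_y = S_y / \bar{Y} \]

and $\rho$ is the correlation coefficient between y and x.

As the approximate bias is of O(1/n), the variance and mean square error to
this order are equal. As an approximate expression for the MSE to O(1/n),
we have

\[ MSE(r) = E(r - R)^2 = R^2 \frac{(N-n)}{(N-1) n} \left[ C_{02} - 2 C_{11} + C_{20} \right] \]
\[ = R^2 \left( \frac{1}{n} - \frac{1}{N} \right) \left[ C_y^2 - 2 \rho C_y C_x + C_x^2 \right] \]
\[ = \frac{1}{\bar{X}^2} \left( \frac{1}{n} - \frac{1}{N} \right) \frac{1}{N-1} \sum_{i=1}^{N} (Y_i - R X_i)^2. \]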

Note 1. The biases and mean square errors of the estimate of population mean
$(\hat{\bar{Y}}_r)$ and the estimate of population total $(\hat{Y}_r)$ may be
either derived independently as in case of r or obtained from the expressions
for bias and mean square error of r as follows:

i) Bias $(\hat{\bar{Y}}_r) = \bar{X}$ Bias (r) and MSE $(\hat{\bar{Y}}_r) = \bar{X}^2$ MSE (r)

ii) Bias $(\hat{Y}_r) = X$ Bias (r) and MSE $(\hat{Y}_r) = X^2$ MSE (r).

Thus, considering terms upto O(1/n),

\[ MSE(\hat{\bar{Y}}_r) = V(\hat{\bar{Y}}_r) = \frac{1-f}{n} \left[ S_y^2 + R^2 S_x^2 - 2 R \rho S_y S_x \right]
   = \frac{1-f}{n} \cdot \frac{1}{N-1} \sum_{i=1}^{N} (Y_i - R X_i)^2, \quad \text{where } f = n/N. \]

The approximate expression for the MSE of $\hat{Y}_r$ is obtained by multiplying
$N^2$ with the MSE $(\hat{\bar{Y}}_r)$.

Note 2: The infinite series is convergent if and only if
$|\delta/\bar{X}| < 1$, which reduces to the condition $\bar{x} < 2\bar{X}$,
that is, $\bar{x}(2n/N - 1) + 2(1 - n/N)\bar{x}' > 0$, where $\bar{x}'$ is the
mean based on the (N - n) units not included in the sample. The above
inequality always holds good if n > N/2 (Koop, 1972). When
$\bar{x} < 2\bar{X}$ does not hold good for all samples, Prabhu-Ajgaonkar and
Deshpande (1985-86) have used a transformation to propose a modified
ratio-estimator of the form
$\hat{\bar{Y}}_{r(pd)} = \bar{y}(\bar{X} + a)/(\bar{x} + a)$, where a is a
predetermined scalar.

The infinite series expansion becomes convergent if a is chosen as

\[ a = X_{max} - 2\bar{X}, \]

where $X_{max}$ is the maximum value of x in the population. The technique
makes use of the inequality max $(\bar{x}) < X_{max}$.
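
The first order approximations to the bias and mean square error of r can be evaluated in either of their two equivalent forms, one in terms of $C_x$, $C_y$ and $\rho$ and the other in terms of the residuals $Y_i - R X_i$. The sketch below computes both for a hypothetical five-unit population (the function name and data are illustrative assumptions) so that the agreement of the two forms can be seen numerically.

```python
def first_order_ratio_approx(Y, X, n):
    """First order bias and MSE of r under SRSWOR, each in two equivalent forms."""
    N = len(Y)
    Y_bar, X_bar = sum(Y) / N, sum(X) / N
    R = Y_bar / X_bar
    Sx2 = sum((x - X_bar) ** 2 for x in X) / (N - 1)
    Sy2 = sum((y - Y_bar) ** 2 for y in Y) / (N - 1)
    Sxy = sum((x - X_bar) * (y - Y_bar) for x, y in zip(X, Y)) / (N - 1)
    k = 1 / n - 1 / N
    # Form 1: C_x^2, C_y^2 and rho*C_y*C_x (= Sxy / (X_bar * Y_bar))
    Cx2, Cy2, Cxy = Sx2 / X_bar**2, Sy2 / Y_bar**2, Sxy / (X_bar * Y_bar)
    bias_cv = R * k * (Cx2 - Cxy)
    mse_cv = R**2 * k * (Cy2 - 2 * Cxy + Cx2)
    # Form 2: sums over the population involving R*X_i - Y_i
    bias_res = k / X_bar**2 * sum(x * (R * x - y) for x, y in zip(X, Y)) / (N - 1)
    mse_res = k / X_bar**2 * sum((y - R * x) ** 2 for x, y in zip(X, Y)) / (N - 1)
    return bias_cv, mse_cv, bias_res, mse_res

bias_cv, mse_cv, bias_res, mse_res = first_order_ratio_approx(
    Y=[3, 5, 9, 12, 20], X=[2, 4, 6, 8, 10], n=2)
print(bias_cv, bias_res, mse_cv, mse_res)
```

For so small an n these are only rough approximations to the exact values; their accuracy improves as n increases.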

Note 3: An alternative technique to derive approximate expressions for bias and
m.s.e. of r is to expand $r = \bar{y}/\bar{x} = f(\bar{y}, \bar{x})$ at the point
$(\bar{Y}, \bar{X})$ by Taylor series and find the first order Taylor
approximation of the function f, neglecting the remainder term. Taylorization
makes the non-linear function expressible in an approximate linear form and
thereby makes it simple and possible to find approximate expressions for the
design bias and design mean square error. The general applicability of the
'Taylor linearization technique' in survey sampling is given by Tepping (1968)
and Woodruff (1971). The Taylor series linear approximation is based on the
assumption that $r = f(\bar{y}, \bar{x})$ is a continuous function having
continuous derivatives of order 2 in an open sphere containing
$(\bar{y}, \bar{x})$ and $(\bar{Y}, \bar{X})$. Thus, we may write

\[ r - R = f(\bar{y}, \bar{x}) - R
   = (\bar{y} - \bar{Y}) \left[ \frac{\partial f}{\partial \bar{y}} \right]_{(\bar{y}, \bar{x}) = (\bar{Y}, \bar{X})}
   + (\bar{x} - \bar{X}) \left[ \frac{\partial f}{\partial \bar{x}} \right]_{(\bar{y}, \bar{x}) = (\bar{Y}, \bar{X})}
   + f_2(\bar{y}, \bar{x}; \bar{Y}, \bar{X}), \]

where

\[ f_2(\bar{y}, \bar{x}; \bar{Y}, \bar{X})
   = \frac{1}{2} (\bar{y} - \bar{Y})^2 \left[ \frac{\partial^2 f}{\partial \bar{y}^2} \right]_{(\bar{y}, \bar{x}) = (y_0, x_0)}
   + \frac{1}{2} (\bar{x} - \bar{X})^2 \left[ \frac{\partial^2 f}{\partial \bar{x}^2} \right]_{(\bar{y}, \bar{x}) = (y_0, x_0)}
   + (\bar{y} - \bar{Y})(\bar{x} - \bar{X}) \left[ \frac{\partial^2 f}{\partial \bar{y} \, \partial \bar{x}} \right]_{(\bar{y}, \bar{x}) = (y_0, x_0)}, \]

$y_0$ lying between $\bar{y}$ and $\bar{Y}$ and $x_0$ lying between $\bar{x}$ and
$\bar{X}$.

As the Taylor series expansion is derived under an infinite population model
with continuous random variables, its validity, along with convergence
problems, for the finite population model is often subject to criticism.
However, experience with large scale complex surveys has shown that the first
order approximation often provides satisfactory results, although for highly
skewed populations the approximation may not be satisfactory. Second and higher
order approximations are also possible by extending the Taylor series expansion
to higher order terms.

However, to the second order of approximation (David and Sukhatme,
1974),


Bias {r) = R 


(A r-n) 
(N - \)n 


(C 20 -Cn) ( C 2 i - C 3o) 


+ 3N(<N " 1 } - (CJ 0 -c 20 c, J) 

(. N-2)(N-3)n y 20 \ 


MSE (r) = 

——77“ (^20 ~2Cii + C 02 ) + — TT” V 2 ^ 2 ! ^ ,2 C3 °/ 

(N - l)n (N-2)n 


Accuracy of the various approximations to the bias and mean square error has 
also been studied by David and Sukhatme (1974). Of course the accuracy of the 
approximations would depend on the sample size and the population under 
study. 

The expressions for bias and mean square error of r can be put in simplified
forms by expressing bivariate moments in terms of bivariate cumulants and
using the notations $C'_{ij} = K_{ij}/(\bar{X}^i \bar{Y}^j)$, where $K_{ij}$ is
the (i, j)th cumulant of x and y, and $\theta_t = [(1/n^t) - (1/N^t)]$. Thus,
following Tin (1965), we write E(r) and MSE (r) to $O(1/n^2)$ as

\[ E(r) = R \left[ 1 + \theta_1 (C'_{20} - C'_{11}) + (\theta_2 - 3\theta_1/N)(C'_{21} - C'_{30})
   + 3 \theta_1^2 C'_{20} (C'_{20} - C'_{11}) \right] \]

\[ MSE(r) = R^2 \left[ \theta_1 (C'_{20} + C'_{02} - 2C'_{11})
   + \theta_1^2 (9C'^2_{20} - 18 C'_{20} C'_{11} + 6 C'^2_{11} + 3 C'_{20} C'_{02})
   - 2(\theta_2 - 3\theta_1/N)(C'_{30} - 2C'_{21} + C'_{12}) \right] \]

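Under bivariate normality of (x, y) and to terms of $O(1/n^3)$,

\[ E(r) = R \left[ 1 + \frac{1}{n} (C'_{20} - C'_{11}) \left( 1 + \frac{3 C'_{20}}{n} + \frac{15 C'^2_{20}}{n^2} \right) \right] \]

\[ MSE(r) = \frac{R^2}{n} \left[ (C'_{20} + C'_{02} - 2C'_{11})
   + \frac{1}{n} (9C'^2_{20} - 18 C'_{20} C'_{11} + 6 C'^2_{11} + 3 C'_{20} C'_{02}) \right. \]
\[ \left. + \frac{1}{n^2} (75 C'^3_{20} - 144 C'^2_{20} C'_{11} + 54 C'_{20} C'^2_{11} + 15 C'^2_{20} C'_{02}) \right] \]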


Note 4. Comparison of $\hat{\bar{Y}}_r$ with $\bar{y}$: In large samples with
SRSWOR the ratio estimator $\hat{\bar{Y}}_r$ has variance smaller than the simple
expansion estimator $\bar{y}$ if

\[ \rho > \frac{1}{2} \cdot \frac{S_x / \bar{X}}{S_y / \bar{Y}} = \frac{1}{2} (C_x / C_y) \]

As the inequality depends on the advance knowledge of $\rho$, $C_y$ and $C_x$, it
cannot be used in practice, even in large scale sample surveys. However, if
$C_x$ is supposed to be much smaller than $C_y$ and the correlation coefficient
is guessed to be very high either from past surveys or from pilot surveys, the
ratio estimator may be the appropriate choice compared to the simple expansion
estimator $\bar{y}$. In small samples it is really difficult to guess the
efficiency of the ratio estimator unless there is reason to believe that the
relation between y and x is approximately linear passing through the origin.
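
The condition of Note 4 follows from the first order variances: the ratio of the variance of the ratio estimator to that of $\bar{y}$ is $1 + (C_x/C_y)^2 - 2\rho (C_x/C_y)$, which is less than one exactly when $\rho > \frac{1}{2}(C_x/C_y)$. A small sketch, with hypothetical function names and guessed values of $\rho$, $C_x$ and $C_y$, is:

```python
def ratio_beats_mean(rho, C_x, C_y):
    """Large-sample condition for the ratio estimator to beat the sample mean."""
    return rho > 0.5 * C_x / C_y

def relative_variance(rho, C_x, C_y):
    """First order V(ratio estimator) / V(sample mean) under SRS."""
    k = C_x / C_y
    return 1 + k**2 - 2 * rho * k

# Hypothetical guesses, e.g. from a pilot survey
print(ratio_beats_mean(0.8, 0.3, 0.4), relative_variance(0.8, 0.3, 0.4))
print(ratio_beats_mean(0.3, 0.4, 0.3), relative_variance(0.3, 0.4, 0.3))
```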


Note 5: Comparison of Ratio estimate based on SRS with estimate based on
PPSWR scheme: With large sample approximations and assuming n/N to be
small or in case of SRSWR

\[ V(\hat{Y}_r) \cong \frac{N^2}{n} \cdot \frac{1}{N} \sum_{i=1}^{N} (Y_i - R X_i)^2. \]

Further for PPSWR

\[ V(\hat{Y}_{ppswr}) = \frac{1}{n} \sum_{i=1}^{N} P_i \left( \frac{Y_i}{P_i} - Y \right)^2, \quad \text{with } P_i = X_i / X. \]

From the expressions for $V(\hat{Y}_r)$ and $V(\hat{Y}_{ppswr})$ it may be seen
that for some populations the ratio estimate is superior and for some other
populations the PPS estimate may be superior. Using the model

\[ Y_i = Y P_i + e_i, \quad (i = 1, \ldots, N), \]

where $e_i$ is independent of $P_i$,

\[ E(e_i \mid P_i) = 0 \text{ and } V(e_i \mid P_i) = a P_i^g, \quad g > 0, \]

Cochran (1953) has shown that, apart from a positive constant factor,

\[ V(\hat{Y}_r) - V(\hat{Y}_{ppswr}) = E(P_i^g) - (1/N) E(P_i^{g-1}) \]
\[ = E(P_i^g) - E(P_i) E(P_i^{g-1}), \text{ since } E(P_i) = 1/N \]
\[ = \mathrm{cov}(P_i, P_i^{g-1}) \]

Thus,

i) if g = 1 the covariance vanishes and the ratio estimate is BLUE,

ii) if g > 1, the covariance is positive and the ratio estimate is inferior to
PPS estimate, and

iii) if 0 < g < 1, the covariance is negative and ratio estimate is superior to
PPS estimate.


Foreman and Brewer (1971) have made use of a heteroscedastic super-population
model to compare the efficiency of the ratio estimator based on SRS
with those of PPSWR and PPSWOR estimators.

They considered a class of populations, each consisting of N identifiable
units with measures of size $X_i$ (i = 1, 2, ..., N) constant from population to
population. The $Y_i$ values of the study variable are related to these measures
of size by the relation

\[ Y_i = \alpha + \beta X_i + e_i, \]

where $\alpha$ and $\beta$ are constants over all units and all populations and
the $e_i$ are independent random variables obeying

\[ E_m(e_i \mid X_i) = 0, \quad E_m(e_i^2 \mid X_i) = \sigma_i^2 \quad \text{and} \quad
   E_m(e_i e_j \mid X_i, X_j) = 0 \text{ for } i \ne j \]

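Taking expectations of $V(\hat{Y}_{ppswr})$ and $V(\hat{Y}_r)$ over the model

\[ E_m V(\hat{Y}_r) = \frac{N^2}{Nn} \left[ \frac{\alpha^2}{\bar{X}^2} \sum_{i=1}^{N} (X_i - \bar{X})^2
   + \left( \frac{X^2 + \sum_{i=1}^{N} X_i^2}{X^2} \right) \sum_{i=1}^{N} \sigma_i^2
   - 2 \sum_{i=1}^{N} \frac{X_i \sigma_i^2}{X} \right] \]

and

\[ E_m V(\hat{Y}_{ppswr}) = \frac{\alpha^2}{n} \sum_{i=1}^{N} \left( \frac{X}{X_i} - N \right)
   + \frac{1}{n} \sum_{i=1}^{N} \frac{X}{X_i} \sigma_i^2 - \frac{1}{n} \sum_{i=1}^{N} \sigma_i^2 \]
\[ \cong \frac{N^2}{n} \cdot \frac{\alpha^2}{\bar{X}^2} \cdot \frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})^2
   + \frac{N \bar{X}}{n} \sum_{i=1}^{N} \frac{\sigma_i^2}{X_i} - \frac{1}{n} \sum_{i=1}^{N} \sigma_i^2, \]

using the approximation

\[ \sum_{i=1}^{N} \left( \frac{1}{X_i} - \frac{1}{\bar{X}} \right) \cong \frac{1}{\bar{X}^3} \sum_{i=1}^{N} (X_i - \bar{X})^2. \]

Now, for large N,

\[ E_m V(\hat{Y}_r) - E_m V(\hat{Y}_{ppswr}) \cong \frac{N^2 \sigma^2}{n} \, \mathrm{cov}(x, x^{g-1}),
   \quad \text{taking } \sigma_i^2 = \sigma^2 x_i^g. \]

Thus, the PPSWR estimate would be more efficient than the ratio estimate if
$x_i^{g-1}$ is positively correlated with $x_i$, i.e. if g > 1, the criterion
put forward earlier by Cochran (1953). However, Foreman and Brewer's result is
based on a more general model. The same conclusion is obtained when the ratio
estimator is compared with the generalized Horvitz-Thompson estimator (Foreman
and Brewer, 1971).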


b) Stratified Random Sampling: Let us consider the estimation of the population total $Y$. Two methods are available to obtain a ratio estimator of the population total through stratified random sampling. Assume that the finite population is divided into $k$ strata.

i) Separate Ratio Estimator. If $y_h$ and $x_h$ are sample totals computed from the $h$th stratum, the ratios $y_h/x_h$ are computed separately for each stratum and, with the knowledge of the population total of $x$, that is, $X_h$ for the $h$th stratum, the separate ratio estimate of the population total $Y$ is

$$\hat Y_{Rs} = \sum_{h=1}^k \frac{y_h}{x_h} X_h = \sum_{h=1}^k (\bar y_h/\bar x_h) X_h.$$

ii) Combined Ratio Estimator. This is computed from the estimated totals $\hat Y$ and $\hat X$ from the stratified sample as

$$\hat Y_{Rc} = (\hat Y/\hat X) X = (N\bar y_{st}/N\bar x_{st}) X = (\bar y_{st}/\bar x_{st}) X.$$

Both $\hat Y_{Rs}$ and $\hat Y_{Rc}$ are biased estimates. However, the combined estimate is much less subject to the risk of bias than the separate estimate. For a large number of strata and small stratum sample sizes ($n_h$), the bias in $\hat Y_{Rs}$ may not be negligible in relation to its standard error, even if the total sample size ($n$) is large. The danger of bias in the separate ratio estimator $\hat Y_{Rs}$ should be taken note of if C.V.($\bar x_h$) exceeds, say, 0.3 (Cochran, 1977). The bias in $\hat Y_{Rc}$ is likely to be negligible in relation to its standard error provided the C.V. of $\bar x_{st}$ is less than 0.1.
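The two stratified estimators can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the dictionary layout (`N`, `X`, `y`, `x`), the function names and the toy numbers are hypothetical and do not come from the text.

```python
def separate_ratio_total(strata):
    """Separate ratio estimate: sum over strata of (y_h / x_h) * X_h."""
    return sum(sum(s["y"]) / sum(s["x"]) * s["X"] for s in strata)


def combined_ratio_total(strata):
    """Combined ratio estimate: (Y_hat / X_hat) * X from the stratified sample."""
    # N_h * (sample mean) is the usual unbiased estimate of a stratum total.
    y_hat = sum(s["N"] * sum(s["y"]) / len(s["y"]) for s in strata)
    x_hat = sum(s["N"] * sum(s["x"]) / len(s["x"]) for s in strata)
    x_total = sum(s["X"] for s in strata)
    return y_hat / x_hat * x_total


# Hypothetical two-stratum sample: N = stratum size, X = known stratum total of x.
strata = [
    {"N": 10, "X": 50.0, "y": [4.0, 6.0], "x": [2.0, 3.0]},
    {"N": 20, "X": 300.0, "y": [30.0, 45.0, 15.0], "x": [10.0, 15.0, 5.0]},
]
print(separate_ratio_total(strata), combined_ratio_total(strata))
```

The separate version uses one ratio per stratum, the combined version a single pooled ratio, which is why the latter is less exposed to small-sample bias when the $n_h$ are small.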

Assume sample sizes are large in all strata. Let $S_{yh}$ and $S_{xh}$ be the stratum standard deviations for $y$ and $x$ in the $h$th stratum, $\rho_h$ be the correlation between $y$ and $x$ in the $h$th stratum,

$$R_h = Y_h/X_h \quad \text{and} \quad f_h = n_h/N_h.$$

Then,

$$V(\hat Y_{Rs}) = \sum_h \frac{N_h^2 (1 - f_h)}{n_h}\left[S_{yh}^2 + R_h^2 S_{xh}^2 - 2R_h\rho_h S_{yh}S_{xh}\right].$$

Further, if the total sample size $n$ is large,

$$V(\hat Y_{Rc}) = \sum_h \frac{N_h^2 (1 - f_h)}{n_h}\left[S_{yh}^2 + R^2 S_{xh}^2 - 2R\rho_h S_{yh}S_{xh}\right],$$

where $R = Y/X$.

To compare the efficiencies of $\hat Y_{Rs}$ and $\hat Y_{Rc}$, we may write

$$V(\hat Y_{Rc}) - V(\hat Y_{Rs}) = \sum_h \frac{N_h^2 (1 - f_h)}{n_h}\left[(R^2 - R_h^2) S_{xh}^2 + 2(R_h - R)\rho_h S_{yh}S_{xh}\right]$$

$$= \sum_h \frac{N_h^2 (1 - f_h)}{n_h}\left[(R - R_h)^2 S_{xh}^2 + 2(R_h - R)(\rho_h S_{yh}S_{xh} - R_h S_{xh}^2)\right].$$

In situations in which the ratio method of estimation is appropriate in each stratum, the second term inside the square bracket is likely to be small, and it vanishes if the regression of $y$ on $x$ in each stratum is linear and passes through the origin. Consequently the separate ratio estimator is likely to be more precise than the combined ratio estimator, unless $R_h$ is constant from stratum to stratum. With small sample size in each stratum, the combined ratio estimator is to be recommended. Alternatively, one may also use the Hartley-Ross unbiased ratio type estimator or an almost unbiased ratio type estimator in each stratum, or in some of the strata, to avoid the effect of bias in $\hat Y_{Rs}$.

c) Systematic Sampling: Swain (1964) has derived relevant expressions for the bias and mean square error of the ratio method of estimation in systematic sampling with $N = nk$, $k$ being the sampling interval and $n$ the size of the sample. We define the ratio estimator of $R$ with the $i$th systematic sample as

$$\hat R_{sy} = \bar y_i/\bar x_i = \bar y_{sy}/\bar x_{sy}.$$

The approximate bias of $\hat R_{sy}$ is

$$B(\hat R_{sy}) = \frac{1}{\bar X^2}\,\frac{1}{k}\sum_{i=1}^k \bar x_i (R\bar x_i - \bar y_i)$$

and the variance up to $O(1/n)$ is

$$V(\hat R_{sy}) = \frac{1}{n\bar X^2}\Big[S_y^2\{1 + (n-1)\rho_{wy}\} + R^2 S_x^2\{1 + (n-1)\rho_{wx}\} - 2R\rho S_y S_x\sqrt{\{1 + (n-1)\rho_{wy}\}\{1 + (n-1)\rho_{wx}\}}\Big],$$

where $\rho_{wy}$ and $\rho_{wx}$ are the intraclass correlations for the $y$'s and $x$'s respectively. When $\rho_{wy} = \rho_{wx} = \rho_w$,

$$V(\hat R_{sy}) = \frac{R^2}{n}\left[C_y^2 - 2\rho C_y C_x + C_x^2\right]\left[1 + (n-1)\rho_w\right].$$

Thus, $\hat R_{sy}$ would be more efficient than the simple random sampling ratio estimator $\hat R_{ran} = r$ if the units are so arranged as to make the intraclass correlation coefficient $\rho_w$ negative.

Further, $\hat R_{sy}$ would be more efficient than $\hat R^* = \bar y_{sy}/\bar X$ if

$$\rho > \frac{1}{2}\,\frac{C_x}{C_y}\sqrt{\frac{1 + (n-1)\rho_{wx}}{1 + (n-1)\rho_{wy}}}.$$

Mishra and Sukhatme (1972) have investigated the use of cluster sampling in forming the ratio estimator.

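d) Unequal Probability Sampling: If sampling is done with unequal probability with replacement, say under the PPSWR scheme,

$$\hat R_{PPSWR} = \sum_{i=1}^n \frac{y_i}{p_i}\Big/\sum_{i=1}^n \frac{x_i}{p_i}, \qquad B(\hat R_{PPSWR}) = \frac{1}{nX^2}\sum_{i=1}^N \frac{X_i}{p_i}(RX_i - Y_i),$$

$$V(\hat R_{PPSWR}) = \frac{1}{nX^2}\sum_{i=1}^N \frac{1}{p_i}(Y_i - RX_i)^2.$$

The efficiency of the ratio method of estimation depends on the appropriateness of the probabilities of selection for estimating $Y$ and $X$ and on positive correlation between $y/p$ and $x/p$.

In unequal probability sampling without replacement the generalized estimator of the population ratio is

$$\hat R_{HT} = \hat Y_{HT}/\hat X_{HT}$$

with exact bias given by

$$B(\hat R_{HT}) = (-1/X)\,\mathrm{cov}(\hat R_{HT}, \hat X_{HT}).$$

The approximate variance is

$$V(\hat R_{HT}) = \frac{1}{X^2}\left[V(\hat Y_{HT}) + R^2 V(\hat X_{HT}) - 2R\,\mathrm{cov}(\hat Y_{HT}, \hat X_{HT})\right] = \frac{1}{X^2}\,V\left(\sum_{i=1}^n \frac{y_i - Rx_i}{\pi_i}\right).$$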

8.2.3 Estimation of Variance from a Sample (Simple 
Random Sampling) 

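Expressing the approximate variance of a ratio estimator $\hat R = r$ in the form

$$V(r) = \frac{1-f}{n\bar X^2}\,\frac{1}{N-1}\sum_{i=1}^N (Y_i - RX_i)^2,$$

two estimators of $V(r)$ may be given by

$$v(r) = \frac{1-f}{n\bar X^2}\,\frac{1}{n-1}\sum_{i=1}^n (y_i - rx_i)^2, \quad \text{if } \bar X \text{ is known, and}$$

$$v^*(r) = \frac{1-f}{n\bar x^2}\,\frac{1}{n-1}\sum_{i=1}^n (y_i - rx_i)^2, \quad \text{if } \bar X \text{ is not known.}$$

Accordingly, we write the estimates of the variance $V(\hat{\bar Y}_R)$ as

$$v_0(\hat{\bar Y}_R) = \frac{1-f}{n}\,\frac{1}{n-1}\sum_{i=1}^n (y_i - rx_i)^2,$$

$$v_1(\hat{\bar Y}_R) = \frac{1-f}{n}\left(\frac{\bar X}{\bar x}\right)\frac{1}{n-1}\sum_{i=1}^n (y_i - rx_i)^2, \quad \text{Wu (1982)},$$

and

$$v_2(\hat{\bar Y}_R) = \frac{1-f}{n}\left(\frac{\bar X}{\bar x}\right)^2\frac{1}{n-1}\sum_{i=1}^n (y_i - rx_i)^2, \quad \text{Hájek (1958a)}.$$

Alternative expressions for the above estimated variances may be written by expressing $[1/(n-1)]\sum (y_i - rx_i)^2$ in the form

$$\frac{1}{n-1}\sum_{i=1}^n (y_i - rx_i)^2 = s_y^2 + r^2 s_x^2 - 2r\hat\rho s_y s_x, \quad \text{or}$$

$$\frac{1}{n-1}\sum_{i=1}^n (y_i - rx_i)^2 = \bar y^2\left(\frac{s_y^2}{\bar y^2} + \frac{s_x^2}{\bar x^2} - \frac{2s_{xy}}{\bar y\bar x}\right).$$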
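As a rough illustration, the three variance estimators of the ratio estimator of the mean can be computed from one simple random sample as below. The function name, argument names and toy data are assumptions made for this sketch; $v_1$ and $v_2$ are obtained by multiplying $v_0$ by $\bar X/\bar x$ and $(\bar X/\bar x)^2$ respectively.

```python
from statistics import mean


def ratio_variance_estimators(y, x, X_bar, N):
    """Return (v0, v1, v2) for the ratio estimator of the mean under SRSWOR."""
    n = len(y)
    r = mean(y) / mean(x)
    f = n / N
    # Mean square of the residuals y_i - r * x_i (divisor n - 1).
    resid_ms = sum((yi - r * xi) ** 2 for yi, xi in zip(y, x)) / (n - 1)
    v0 = (1 - f) / n * resid_ms
    adj = X_bar / mean(x)
    return v0, adj * v0, adj ** 2 * v0


# Hypothetical sample of n = 4 from N = 40 units with known X_bar = 3.
y = [3.0, 5.0, 8.0, 10.0]
x = [1.0, 2.0, 3.0, 4.0]
v0, v1, v2 = ratio_variance_estimators(y, x, X_bar=3.0, N=40)
print(v0, v1, v2)
```

When the sample mean of $x$ falls below $\bar X$, as in this toy sample, $v_1$ and $v_2$ inflate $v_0$; when it falls above, they shrink it.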


Assuming simple random sampling, P.S.R.S. Rao and J.N.K. Rao (1971) have derived exact expressions for the biases of $v_0$ and $v_2$ under an infinite population linear regression model

$$y_i = \alpha + \beta x_i + e_i,$$

$$E(e_i \mid x_i) = 0, \quad V(e_i \mid x_i) = \sigma^2 x_i^t, \quad E(e_i e_j \mid x_i, x_j) = 0, \ i \neq j,$$

for a general value of $t$ when $x$ has a gamma distribution with parameter $h$. They have also obtained the mean square errors of $v_0$ and $v_2$. Their conclusions are (i) $v_2$ is less biased than $v_0$ for $0 < t < 1.5$, (ii) $v_0$ is less biased than $v_2$ for $t = 2$, (iii) $v_0$ is more stable than $v_2$ when $t = 0$ or 1, (iv) $v_2$ is more stable than $v_0$ when $t = 2$ and $\alpha = 0$.

Hutchinson (1971) made use of the above model to compare $v_0$ and $v_2$ with $x$ having a log normal distribution. Using the above model with $\alpha = 0$, Royall and Eberhardt (1975) have suggested a modified estimator of variance (adjusting bias) as

$$v_H = v_0\left[\frac{\bar x_c\bar X}{\bar x^2}\right]\left[1 - \frac{c_x^2}{n}\right]^{-1},$$

where $c_x$ is the sample coefficient of variation of $x$ and $\bar x_c$ is the mean of the $(N - n)$ units not in the sample. As $v_H \doteq v_2$ for large $n$ and $N \gg n$, they have justified $v_2$ as an estimator from the super population view point. The estimator $v_H$ is model unbiased with $t = 1$ and remains approximately so when the variance of $e_i$ in the model is not proportional to $x_i$. An empirical study on $v_H$ has been reported by Royall and Cumberland (1978a, b). Royall and Eberhardt (1975) have also shown that $v_H$ is asymptotically equivalent to the jackknife variance estimator (Tukey, 1958) given by

$$v_J = (1 - f)\,\bar X^2 (n - 1)\sum_{j \in s}\frac{D_{(j)}^2}{n},$$

where for every $j$ in $s$, $D_{(j)}$ is the difference between the ratio $(n\bar y - y_j)/(n\bar x - x_j)$ and the average of these $n$ ratios.
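A minimal sketch of the jackknife variance estimator just described, namely $(1-f)\bar X^2 (n-1)/n$ times the sum of squared deviations of the $n$ delete-one ratios from their average. The function name and the sample values are hypothetical.

```python
def jackknife_variance(y, x, X_bar, N):
    """Delete-one jackknife variance estimate for the ratio estimator of the mean."""
    n = len(y)
    sy, sx = sum(y), sum(x)
    # Ratio recomputed with unit j removed: (n*ybar - y_j) / (n*xbar - x_j).
    ratios = [(sy - yj) / (sx - xj) for yj, xj in zip(y, x)]
    avg = sum(ratios) / n
    ss = sum((rj - avg) ** 2 for rj in ratios)
    return (1 - n / N) * X_bar ** 2 * (n - 1) / n * ss


# If y is exactly proportional to x, every delete-one ratio is identical.
v_prop = jackknife_variance([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], X_bar=2.5, N=30)
v_gen = jackknife_variance([3.0, 5.0, 8.0, 10.0], [1.0, 2.0, 3.0, 4.0], X_bar=3.0, N=40)
print(v_prop, v_gen)
```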

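Royall and Cumberland (1978a) have suggested another estimator

$$v_D = \frac{1-f}{n}\,\frac{\bar x_c\bar X}{\bar x^2}\,\frac{1}{n}\sum_{i=1}^n \frac{d_i^2}{1 - x_i/(n\bar x)},$$

where $d_i = y_i - rx_i$, which is also unbiased under the model with $\alpha = 0$ and $t = 1$ and further asymptotically equivalent to $v_J$.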

Royall and Cumberland (1981a) have made an exhaustive empirical study 
of the conventional ratio estimator and estimators of its variance and have 
compared with the theoretical results. 

Wu (1982) has considered a general class of variance estimators

$$v_g = (\bar X/\bar x)^g\, v_0$$

to make a comparison of the relative merits of $v_0$, $v_1$ and $v_2$. By minimizing the leading terms in the expansion of the MSE he obtained the optimum value of $g$ and showed that among $v_0$, $v_1$ and $v_2$, $v_0$ is the best if $g_{opt} < 0.5$, $v_1$ the best if $0.5 < g_{opt} < 1.5$ and $v_2$ the best if $g_{opt} > 1.5$. Assuming the super-population model with $\alpha = 0$, $v_0$ is optimal among $v_g$ under $t = 0$ and $v_1$ is optimal among $v_g$ under $t = 1$. As the conventional ratio estimator of $\bar Y$ is optimal (BLUE) under the super-population model with $\alpha = 0$ and $t = 1$ (Brewer, 1963a; Royall, 1970), it is desirable to use $v_1$ as the estimator of variance, provided the statistician has reason to believe that the super population model adequately describes the population. For $t > 1$, $v_1$ and $v_2$ fare better than $v_0$.


8.2.4 Asymptotic Normality of Ratio Estimator 

For large samples $\hat{\bar Y}_R$ and $\hat R$ may be assumed to be asymptotically normally distributed, and hence the confidence limits for $\bar Y$ and $R$ can be obtained from the normal approximations

$$\bar Y : \hat{\bar Y}_R \pm z\sqrt{v(\hat{\bar Y}_R)}, \qquad R : \hat R \pm z\sqrt{v(\hat R)},$$

where $z$ is the standard normal deviate corresponding to the chosen level of confidence.

Theoretical justification for the asymptotic normality of $\hat{\bar Y}_R$ has been provided by Scott and Wu (1981), who proved the necessary central limit theorem. The proof of a central limit theorem in the finite population, without replacement context assumes special importance because the assumption that the sample observations are independently drawn from the same population is violated.

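Asymptotic results for finite populations in the present context presuppose an infinite sequence of simple random sampling experiments, the $\nu$th of which has a population of size $N_\nu$ with paired values $(Y_{\nu 1}, X_{\nu 1}), \ldots, (Y_{\nu N_\nu}, X_{\nu N_\nu})$. A simple random sample of size $n_\nu$ is selected from the $\nu$th population.

Further, both $n_\nu$ and $N_\nu - n_\nu$ tend to infinity as $\nu \to \infty$ and

$$\lim_{\nu\to\infty}\frac{n_\nu}{N_\nu} < \tfrac{1}{2}.$$

Write

$$\hat{\bar Y}_{\nu R} - \bar Y_\nu = (\bar y_\nu/\bar x_\nu)\bar X_\nu - \bar Y_\nu = \frac{\bar X_\nu}{\bar x_\nu}(\bar y_\nu - R_\nu\bar x_\nu) = \frac{\bar X_\nu}{\bar x_\nu}\,\frac{1}{n_\nu}\sum_{j=1}^{n_\nu} E_{\nu j},$$

where $E_{\nu j} = y_{\nu j} - R_\nu x_{\nu j}$.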

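Theorem 8.1 (Scott and Wu, 1981): Let $\hat{\bar Y}_{\nu R} = \bar y_\nu(\bar X_\nu/\bar x_\nu)$. Then, under SRSWOR,

$$\frac{\sqrt{n_\nu}\,(\hat{\bar Y}_{\nu R} - \bar Y_\nu)}{\sqrt{(1 - f_\nu)}\,S_{\nu R}} \xrightarrow{L} N(0, 1) \quad \text{as } \nu \to \infty,$$

where

$$S_{\nu R}^2 = \frac{1}{N_\nu - 1}\sum_{j=1}^{N_\nu} E_{\nu j}^2.$$

The theorem can be proved easily by applying the following Lemmas.

Lemma 1: Under SRSWOR,

$$\frac{\sqrt{n_\nu}\,(\bar y_\nu - \bar Y_\nu)}{\sqrt{(1 - f_\nu)}\,S_{\nu y}} \xrightarrow{L} N(0, 1) \quad \text{as } \nu \to \infty,$$

if and only if $\{Y_{\nu j}\}$ satisfies the Lindeberg-Hájek condition

$$\lim_{\nu\to\infty}\sum_{T_\nu(\delta)} (Y_{\nu j} - \bar Y_\nu)^2 \big/ (N_\nu - 1) S_{\nu y}^2 = 0$$

for any $\delta > 0$, where $T_\nu(\delta)$ is the set of units in $U_\nu$ for which

$$\frac{|Y_{\nu j} - \bar Y_\nu|}{\sqrt{(1 - f_\nu)}\,S_{\nu y}} > \delta\sqrt{n_\nu}.$$

Lemma 2: Suppose $\{X_{\nu j}\}$ satisfies the condition $(1 - f_\nu) S_{\nu x}^2/n_\nu \to 0$ as $\nu \to \infty$. Then, under SRSWOR,

i) $E|\bar x_\nu - \bar X_\nu| \to 0$,

ii) $\bar x_\nu \xrightarrow{p} \bar X_\nu$ as $\nu \to \infty$.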


8.2.5 Model Free Prediction 

The structure of the ratio estimator of the population total can be justified from the consideration of model free predictive estimation having an appealing property called 'internal congruency' by Sampford (1978). For any estimator of

$$Y = \sum_{i \in s} Y_i + \sum_{i \notin s} Y_i,$$

we need to predict the unknown part $\sum_{i \notin s} Y_i$ with the help of a plausible estimator (predictor) depending on the sampled units. If the ratio method of estimation is to be justified, $Y_i/X_i$ is expected to be approximately constant. Thus, it would be reasonable to estimate $Y_i$ in the non-sampled units of the population by $X_i(\bar y/\bar x)$. Therefore, $\sum_{i \notin s} Y_i$ is predicted by $\sum_{i \notin s} X_i(\bar y/\bar x)$, provided that the sample ($s$) is a simple random sample. Thus,

$$\hat Y = \sum_{i \in s} Y_i + \frac{\bar y}{\bar x}\sum_{i \notin s} X_i = n\bar y + \frac{\bar y}{\bar x}(X - n\bar x) = \frac{\bar y}{\bar x}X.$$
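The identity between the predictive form and the usual ratio estimator of the total can be checked numerically. The sketch below uses hypothetical function names and made-up values; it simply adds the observed $y$ total to ratio predictions for the unsampled units and compares with $(\bar y/\bar x)X$.

```python
def predictive_total(y_s, x_s, x_rest):
    """Observed y total plus predictions X_i * (ybar / xbar) for unsampled units."""
    r = sum(y_s) / sum(x_s)  # equals ybar / xbar
    return sum(y_s) + sum(r * xi for xi in x_rest)


def ratio_total(y_s, x_s, X_total):
    """Conventional ratio estimator of the population total."""
    return sum(y_s) / sum(x_s) * X_total


y_s, x_s = [12.0, 20.0, 9.0], [4.0, 7.0, 3.0]  # sampled units
x_rest = [5.0, 6.0, 2.0, 8.0]                  # x-values of unsampled units
X_total = sum(x_s) + sum(x_rest)
print(predictive_total(y_s, x_s, x_rest), ratio_total(y_s, x_s, X_total))
```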


8.2.6 Unbiased Ratio Type Estimators 

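We have seen that $r$ is a biased estimator and $E(r)$ can be expanded in the form

$$E(r) = R + (b_1/n) + (b_2/n^2) + \ldots$$

Methods have been proposed in the literature to make $r$ completely unbiased, either by changing the form of the estimator or by taking recourse to a selection procedure other than srs.

Hartley-Ross Unbiased Ratio Type Estimator

Define

$$\bar r = \frac{1}{n}\sum_{i=1}^n r_i = \frac{1}{n}\sum_{i=1}^n \frac{y_i}{x_i}.$$

Then

$$\text{Bias}(\bar r) = E(\bar r - R) = E(r_i - R) = E\left[\frac{y_i}{x_i} - \frac{\bar Y}{\bar X}\right] = -\frac{1}{\bar X}E\left[y_i - \bar X(y_i/x_i)\right]$$

$$= -\frac{1}{N\bar X}\left[\sum_{i=1}^N r_i X_i - \bar X\sum_{i=1}^N r_i\right] = -\frac{1}{N\bar X}\sum_{i=1}^N r_i (X_i - \bar X).$$

$$\text{Est.}\ \sum_{i=1}^N r_i (X_i - \bar X) = \frac{N-1}{n-1}\sum_{i=1}^n r_i (x_i - \bar x) = \frac{n(N-1)}{n-1}(\bar y - \bar r\bar x).$$

Correcting the bias in $\bar r$, we have an unbiased ratio type estimator for $R$ proposed by Hartley and Ross (1954):

$$r_{HR} = \bar r + \frac{N-1}{N}\,\frac{n}{n-1}\,\frac{1}{\bar X}\left[\bar y - \bar r\bar x\right].$$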


Further,

$$\hat{\bar Y}_{HR} = \bar r\bar X + \left(\frac{N-1}{N}\right)\left(\frac{n}{n-1}\right)(\bar y - \bar r\bar x).$$

The exact variance of $\hat{\bar Y}_{HR}$ has been derived by Robson (1957). For large $n$ and $N \gg n$, Goodman and Hartley (1958) have shown that

$$V(\hat{\bar Y}_{HR}) = (1/n)(\sigma_y^2 + \bar R^2\sigma_x^2 - 2\bar R\rho\sigma_y\sigma_x),$$

where $\bar R = E(y_i/x_i) = (1/N)\sum_{i=1}^N (Y_i/X_i)$.

Now, $V(\hat{\bar Y}_R) - V(\hat{\bar Y}_{HR}) = (\sigma_x^2/n)\left[(R - \beta)^2 - (\bar R - \beta)^2\right]$, where $\beta$ is the regression coefficient of $y$ on $x$.

Thus, to the first order of approximation, $\hat{\bar Y}_{HR}$ is more efficient than $\hat{\bar Y}_R$ if $\beta$ is nearer to $\bar R$ than to $R$. Goodman and Hartley (1958) have stated that the regression coefficient $\beta$ is often found to be closer to $R$ than to $\bar R$. Further, as pointed out by Tin (1965), $\hat{\bar Y}_{HR}$ is not asymptotically the minimum variance bound estimator (assuming $y$, $x$ to follow a bivariate normal distribution), as verified in the case of $r$. Moreover, $r_{HR}$ is not so convenient in practice for calculation.
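The exact unbiasedness of the Hartley-Ross estimator under SRSWOR can be verified by brute force on a small population. The sketch below is illustrative only: the five-unit population is invented, and the estimator is coded as the mean of ratios plus the correction $\frac{N-1}{N}\frac{n}{n-1}(\bar y - \bar r\bar x)/\bar X$.

```python
from itertools import combinations
from statistics import mean


def hartley_ross(y, x, X_bar, N):
    """Hartley-Ross unbiased ratio type estimator of R from an SRSWOR sample."""
    n = len(y)
    r_bar = mean(yi / xi for yi, xi in zip(y, x))
    return r_bar + (N - 1) / N * n / (n - 1) * (mean(y) - r_bar * mean(x)) / X_bar


# Hypothetical population of N = 5 units; enumerate all samples of size n = 3.
Y = [3.0, 7.0, 8.0, 15.0, 11.0]
X = [2.0, 3.0, 5.0, 6.0, 4.0]
N, n = len(Y), 3
X_bar, R = mean(X), sum(Y) / sum(X)
samples = list(combinations(range(N), n))
avg_hr = mean(
    hartley_ross([Y[i] for i in s], [X[i] for i in s], X_bar, N) for s in samples
)
print(avg_hr, R)
```

Averaging over all equally likely samples reproduces $R$ up to rounding, whereas the same average for the conventional ratio $\bar y/\bar x$ would in general differ from $R$.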

Mickey (1959) suggested another estimate which may be derived as follows (Cochran, 1977). Compute $n$ ratios $r_{(j)} = \sum' y_j/\sum' x_j$, removing each unit in turn from the sample. The summation $\sum'$ is over the remaining $(n - 1)$ units. Denoting $\bar r_1 = (1/n)\sum r_{(j)}$, Mickey's estimate of $R$ is then given by

$$r_M = \bar r_1 + \frac{n(N - n + 1)}{N\bar X}(\bar y - \bar r_1\bar x).$$

If we split the sample into $g$ groups each of size $p$, where $n = pg$, and we omit the $j$th group containing $p$ observations successively to compute the ratios, Mickey's unbiased estimator takes the form

$$r_M = \bar r_1 + \frac{g(N - n + p)}{N\bar X}(\bar y - \bar r_1\bar x).$$

This is a slight deviation from Mickey's original estimator in the sense that we have averaged over the $g$ groups rather than over all possible groups of size $p$, as Mickey did.

J.N.K. Rao (1967) studied the precision of $r_M$ along with $r_{HR}$ and showed that $V(r_M)$ decreases monotonically as $g$ increases, so that the optimum choice of $g$ is $n$, and further that $r_M$ with $g = n$ is considerably more efficient than $r_{HR}$ for $n > 2$.
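A sketch of Mickey's delete-one estimator, written as $\bar r_1 + n(N-n+1)(\bar y - \bar r_1\bar x)/(N\bar X)$, together with an exhaustive check of its unbiasedness on an invented population. Names and data are assumptions for illustration.

```python
from itertools import combinations


def mickey(y, x, X_bar, N):
    """Mickey's unbiased ratio type estimator (delete-one version, g = n)."""
    n = len(y)
    sy, sx = sum(y), sum(x)
    # Average of the n ratios obtained by removing each unit in turn.
    r1 = sum((sy - yj) / (sx - xj) for yj, xj in zip(y, x)) / n
    return r1 + n * (N - n + 1) / (N * X_bar) * (sy / n - r1 * sx / n)


# Hypothetical population; all SRSWOR samples of size 3 are equally likely.
Y = [3.0, 7.0, 8.0, 15.0, 11.0]
X = [2.0, 3.0, 5.0, 6.0, 4.0]
N, n = len(Y), 3
X_bar, R = sum(X) / N, sum(Y) / sum(X)
values = [
    mickey([Y[i] for i in s], [X[i] for i in s], X_bar, N)
    for s in combinations(range(N), n)
]
avg_mickey = sum(values) / len(values)
print(avg_mickey, R)
```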

Williams (1961) has proposed a general method of generating unbiased 
ratio estimators so as to be able to remove the inherent bias in the conventional 
ratio estimator. This is discussed as follows: 

For any constant $a$, the estimator $t = \bar y - a(\bar x - \bar X)$ is unbiased for srs of size $n$. Suppose now that the sample of $n$ observations is split at random into two parts: the first $p$ observations in the order of drawing and the remaining $(n - p)$ observations. Now use the sub-sample of the first $p$ observations to determine $a(p)$, and calculate $\bar y_{n-p}$ and $\bar x_{n-p}$ from the remainder sample only.

Since the remainder sample can be regarded as a random sample from the $(N - p)$ remaining population units, we now construct an unbiased estimator in the form mentioned above for the mean of the remainder population. Thus, we have an estimator for the mean of the remainder population given by

$$t_{N-p} = \frac{n\bar y - p\bar y_p}{n - p} - a(p)\left[\frac{n\bar x - p\bar x_p}{n - p} - \frac{N\bar X - p\bar x_p}{N - p}\right],$$

which is unbiased for $(N\bar Y - p\bar y_p)/(N - p)$.

Now, the unbiased estimator of the overall population mean $\bar Y$ is

$$t_w = \frac{(N - p)\,t_{N-p} + p\bar y_p}{N} = \left[\frac{N - p}{N}\right]\left[\frac{n}{n - p}\right]\left[\bar y - a(p)(\bar x - \bar X)\right] - \left[\frac{N - n}{N}\right]\left[\frac{p}{n - p}\right]\left[\bar y_p - a(p)(\bar x_p - \bar X)\right].$$

The choice of $p$ in the above formulation is arbitrary, with $1 \le p \le n - 1$. The function $a(p)$ of the sub-sample is also arbitrary. The bias in $t$ arises when $a$ depends on the same set of observations as is used in calculating $\bar y$ and $\bar x$. The method of Williams delinks this to generate a general class of unbiased estimators. However, these estimators possess an undesirable feature in the sense that they depend on the order in which the sample is drawn. This can be overcome by averaging $t_w$ over the $n!$ possible orderings of the sample, which may lead to a simple form.

Choosing $a(p) = (1/p)\sum y_i/x_i$ and $p = 1$,

$$t_w = \bar X\frac{y_1}{x_1} + \frac{N-1}{N}\,\frac{n}{n-1}\left(\bar y - \frac{y_1}{x_1}\bar x\right).$$

Averaging over all $n!$ orderings of the sample,

$$\bar t_w = \bar r\bar X + \frac{N-1}{N}\,\frac{n}{n-1}(\bar y - \bar r\bar x),$$

which is the Hartley-Ross unbiased ratio estimator $\hat{\bar Y}_{HR}$.


8.2.7 Elimination of Bias of Conventional Ratio Estimator by 
Modifying the Sampling Scheme 


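The procedure consists in selecting the sample ($s$) with probability proportional to its aggregate size (PPAS) and is attributed to Lahiri (1951), Midzuno (1950) and Sen (1952). This is achieved by selecting the first unit with probability proportional to the size measure $x$ and the remaining $(n - 1)$ units with equal probabilities without replacement. Thus,

$$p(s) = \sum_{i \in s}\frac{x_i}{N\bar X}\cdot\frac{1}{\binom{N-1}{n-1}} = \frac{\bar x}{\bar X\binom{N}{n}}.$$

Now

$$E(\hat{\bar Y}_R) = E\left[(\bar y/\bar x)\bar X\right] = \sum_s \frac{\bar y}{\bar x}\bar X\,\frac{\bar x}{\bar X\binom{N}{n}} = \frac{1}{\binom{N}{n}}\sum_s \bar y = \bar Y,$$

$$V(\hat{\bar Y}_R) = E\left[(\bar y/\bar x)\bar X\right]^2 - \bar Y^2 = E_{srs}\left[\bar y^2\,\frac{\bar X}{\bar x}\right] - \bar Y^2 \doteq \frac{1-f}{n}\left[S_y^2 + R^2 S_x^2 - 2R\rho S_y S_x\right], \text{ to terms of } O(1/n).$$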
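The claim that the conventional ratio estimator becomes unbiased when samples are drawn with probability proportional to their aggregate size can be checked by enumeration. In this hypothetical sketch each sample $s$ receives probability $\bar x_s/\{\bar X\binom{N}{n}\}$; the population values are invented.

```python
from itertools import combinations
from math import comb

# Hypothetical population of N = 5 units, samples of size n = 3.
Y = [3.0, 7.0, 8.0, 15.0, 11.0]
X = [2.0, 3.0, 5.0, 6.0, 4.0]
N, n = len(Y), 3
X_bar, Y_bar = sum(X) / N, sum(Y) / N

total_prob = 0.0
expected_ratio_est = 0.0
for s in combinations(range(N), n):
    x_bar = sum(X[i] for i in s) / n
    y_bar = sum(Y[i] for i in s) / n
    p_s = x_bar / (X_bar * comb(N, n))      # PPAS sample probability
    total_prob += p_s
    expected_ratio_est += p_s * (y_bar / x_bar) * X_bar

print(total_prob, expected_ratio_est, Y_bar)
```

The probabilities sum to one and the design expectation of $(\bar y/\bar x)\bar X$ equals $\bar Y$, because the factor $\bar x$ in $p(s)$ cancels the denominator of the ratio.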

T.J. Rao (1966b) has derived an exact expression for the variance of the ratio estimator. The ratio estimator under the Lahiri-Midzuno-Sen (LMS) scheme is unbiased, and its variance up to $O(1/n)$ is equal to that under the simple random sampling scheme. R. Singh (1975a) has shown that under bivariate normality of $(y, x)$, considering terms up to $O(1/n^2)$, $\hat{\bar Y}_R$ under the LMS scheme is always more efficient than under the SRSWOR scheme. An unbiased estimator of $V(\hat{\bar Y}_R)$ under the LMS scheme is given by

$$v(\hat{\bar Y}_R) = \text{Est. } E(\hat{\bar Y}_R - \bar Y)^2 = (\hat{\bar Y}_R)^2 - \text{Est. } \bar Y^2$$

$$= \left(\frac{\bar y}{\bar x}\bar X\right)^2 - \frac{1}{nN}\left(\frac{\bar X}{\bar x}\right)\left[\sum_{i \in s} y_i^2 + \frac{N-1}{n-1}\sum_{i \neq j \in s} y_i y_j\right],$$

using

$$E\left[\sum_{i \in s}\frac{y_i^2}{N' p(s)}\right] = \sum_{i=1}^N Y_i^2 \quad \text{and} \quad E\left[\sum_{i \neq j \in s}\frac{y_i y_j}{N'' p(s)}\right] = \sum_{i \neq j} Y_i Y_j,$$

where

$$N' = \binom{N-1}{n-1} \quad \text{and} \quad N'' = \binom{N-2}{n-2}.$$

This estimator of variance may assume negative values. T.J. Rao (1977a) has studied the problem of estimating the variance of $\hat{\bar Y}_R$ under the LMS scheme and has derived sufficient conditions under which the variance estimator suggested by him is always positive.

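The Lahiri-Midzuno-Sen scheme, which makes the conventional ratio estimator $\hat{\bar Y}_R = \bar y(\bar X/\bar x)$ unbiased, presupposes positive correlation between $y$ and $x$. If the study variate is negatively correlated with $x$, Deshpande (1982) suggested a ratio type estimator $\hat{\bar Y}_{DR} = \bar y(\bar X/\bar x^*)$, where $\bar x^* = (N\bar X - n\bar x)/(N - n)$, which remains unbiased under the Deshpande (1982) scheme. This scheme consists of rejecting the first unit selected with probability proportional to size ($p_i = x_i/N\bar X$) and then selecting a SRSWOR sample of size $n$ from the remaining $(N - 1)$ units. This sampling scheme gives

$$p(s) = \left[1 - \sum_{i \in s} p_i\right]\frac{1}{\binom{N-1}{n}} = \frac{1}{\binom{N-1}{n}}\left(1 - \frac{n\bar x}{N\bar X}\right) = \frac{\bar x^*}{\bar X\binom{N}{n}},$$

$$V(\hat{\bar Y}_{DR}) = \frac{1}{\binom{N}{n}}\sum_s \bar y^2\,\frac{\bar X}{\bar x^*} - \bar Y^2,$$

and its estimator is

$$\text{Est. } V(\hat{\bar Y}_{DR}) = (\hat{\bar Y}_{DR})^2 - \frac{\bar X}{nN\bar x^*}\left[\sum_{i \in s} y_i^2 + \frac{N-1}{n-1}\sum_{i \neq j \in s} y_i y_j\right].$$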

8.2.8 Product Method of Estimation 


Goodman (1960) and Murthy (1964) have proposed another type of estimator, termed the product estimator, defined by $\hat{\bar Y}_p = \bar y\bar x/\bar X$. Consequently, the estimator of the ratio is $\hat R_p = \bar y\bar x/\bar X^2$.

Now, $E(\hat{\bar Y}_p) = \bar Y[1 + \mathrm{cov}(\bar y, \bar x)/\bar X\bar Y]$, showing that $\hat{\bar Y}_p$ is a biased estimator of $\bar Y$ and the bias is of order $(1/n)$. This bias can be reduced to a lower order by Quenouille's method of bias reduction. However, the exact bias can be estimated to obtain an unbiased product estimator of the form

$$\hat{\bar Y}_{up} = \frac{1}{\bar X}\left[\bar y\bar x - \frac{1-f}{n}\,s_{xy}\right],$$

where $s_{xy}$ is the sample covariance. To first order approximation,

$$V(\hat{\bar Y}_p) = V(\hat{\bar Y}_{up}) = \frac{1-f}{n}\left[S_y^2 + R^2 S_x^2 + 2R\rho S_y S_x\right].$$
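A small enumeration illustrates why subtracting the estimated bias gives an exactly unbiased product estimator under SRSWOR: $E(\bar y\bar x) = \bar X\bar Y + \frac{1-f}{n}S_{xy}$ and $E(s_{xy}) = S_{xy}$. The code is a sketch with an invented population and hypothetical names.

```python
from itertools import combinations
from statistics import mean


def unbiased_product(y, x, X_bar, N):
    """Product estimator of the mean with its exact bias estimate removed."""
    n = len(y)
    y_bar, x_bar = mean(y), mean(x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    return (y_bar * x_bar - (1 - n / N) / n * s_xy) / X_bar


# Hypothetical population in which y tends to fall as x rises.
Y = [15.0, 11.0, 8.0, 7.0, 3.0]
X = [2.0, 3.0, 5.0, 6.0, 4.0]
N, n = len(Y), 3
X_bar, Y_bar = mean(X), mean(Y)
avg_up = mean(
    unbiased_product([Y[i] for i in s], [X[i] for i in s], X_bar, N)
    for s in combinations(range(N), n)
)
print(avg_up, Y_bar)
```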

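Comparing $V(\hat{\bar Y}_p)$ with $V(\bar y)$ under SRSWOR, we find that $\hat{\bar Y}_p$ is more efficient than $\bar y$ if $\rho < -\tfrac{1}{2}(C_x/C_y)$, indicating thereby that if $y$ and $x$ are negatively correlated, the product method of estimation should be considered, provided there is enough evidence in favour of it as regards the above condition. The estimator $(1/n\bar X)\sum y_i x_i$ is also biased; combining it with $\hat{\bar Y}_p$ leads to an unbiased estimator (Robson, 1957)

$$\frac{(N-1)\,n\,\bar y\bar x}{N(n-1)\bar X} - \frac{(N-n)\sum_{i=1}^n y_i x_i}{Nn(n-1)\bar X}.$$

When $y$ and $x$ are negatively correlated, one may alternatively apply the ratio method to the transformed variable $z_i = 1/x_i$; the corresponding estimator is $\bar y(\bar Z/\bar z)$.

Agarwal and Jain (1989) considered such an estimator defined as

$$\hat{\bar Y}'_p = \bar y(\bar x_H/\bar X_H),$$

where $\bar x_H$ and $\bar X_H$ are the sample and population harmonic means respectively of the $x$-values. $\hat{\bar Y}'_p$ is optimal in the sense of model unbiasedness and mean square error under a super population model in which $E_m(Y_i \mid x_i) = \beta/x_i$ and $V_m(Y_i \mid x_i) = \lambda/x_i$. They also compared its $p$-bias and $p$-mean square error with those of $\hat{\bar Y}_p = \bar y(\bar x/\bar X)$ and Sampford's (1978) estimator

$$\hat{\bar Y}''_p = \frac{n\bar y}{N} + \frac{(N-n)^2}{N}\,\frac{\bar x\bar y}{N\bar X - n\bar x}.$$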


8.2.9 Dual to Ratio Estimator 


Srivenkataramana (1980) has proposed an alternative estimator, which is said to be dual to the conventional ratio estimator $\hat{\bar Y}_R$, when $y$ and $x$ are positively correlated, given by $\hat{\bar Y}_s = \bar y(\bar x^*/\bar X)$, where $\bar x^* = (N\bar X - n\bar x)/(N - n)$. To first order approximation,

$$V(\hat{\bar Y}_s) = \bar Y^2\,\frac{1-f}{n}\left[C_y^2 - 2gC_{yx} + g^2 C_x^2\right], \quad \text{where } g = n/(N - n).$$

Comparing $\hat{\bar Y}_s$ with $\hat{\bar Y}_R$ and $\bar y$, it may be seen that $\hat{\bar Y}_s$ is more efficient than both $\hat{\bar Y}_R$ and $\bar y$ if $\tfrac{1}{2}g < k < \tfrac{1}{2}(1 + g)$, assuming $(1 - g) > 0$, that is, $N > 2n$, and with $k = C_{yx}/C_x^2$.

In case $y$ and $x$ are negatively correlated, the dual to the product estimator is obtained as a ratio estimator $\hat{\bar Y}'_s = \bar y(\bar X/\bar x^*)$. Srivenkataramana (1980) has also discussed the optimality of the estimators suggested by him.


8.2.10 Almost Unbiased Ratio Type Estimators (Estimators
with Bias of $O(1/n^2)$)

a) Quenouille's ratio estimator. Quenouille (1949) originally introduced the technique of the jack-knife (the term is due to Tukey, 1958) as a method of reducing the bias of the serial correlation coefficient. In a subsequent paper Quenouille (1956) generalized his technique of bias reduction in an infinite population context. Durbin (1959) applied Quenouille's method of bias reduction to the estimation of ratios, and its general version is given below.

Divide the $n$ observations into $g$ groups of size $p$, so that $n = pg$. Define $r_{(j)} = \sum' y/\sum' x$, where $\sum' y$ and $\sum' x$ are the totals of the sample values omitting the $j$th group. Then compute the pseudovalues

$$r_j = gr - (g - 1)r_{(j)}.$$

Quenouille's (1956) estimator is

$$r_Q = \frac{1}{g}\sum_{j=1}^g r_j = gr - (g - 1)\bar r_g,$$

where $\bar r_g = (1/g)\sum r_{(j)}$.

As is known, the expected value of $r$ is of the form

$$E(r) = R + (a_1/n) + (a_2/n^2) + \ldots$$

Similarly,

$$E(r_{(j)}) = R + \frac{a_1}{(g-1)p} + \frac{a_2}{(g-1)^2 p^2} + \ldots$$

It may be verified that Bias($r_Q$) is of the form $O(1/n^2)$. In SRSWOR, if the sampling fraction $f = n/N$ is not negligible, the leading term in Bias($r$) is of the form $a_1(1 - f)/n$. Thus, to remove both the terms in $1/n$ and $1/N$, the pseudovalues are set as $r_j = wr - (w - 1)r_{(j)}$, where $w = g[1 - (n - p)/N]$.

Alternatively, one may follow Wolter (1985) and replace $r_{(j)}$ by $r'_{(j)} = r + \sqrt{(1 - f)}\,(r_{(j)} - r)$.

For $g = 2$, Durbin (1959) demonstrated that under the super population model $Y_i = \alpha + \beta x_i + e_i$ ($i = 1, \ldots, N$), $E_m(e_i \mid x_i) = 0$, $V_m(e_i \mid x_i) = \sigma^2$, $\mathrm{cov}_m(e_i, e_j \mid x_i, x_j) = 0$ ($i \neq j$), with $x$ following either a normal or a gamma distribution, Quenouille's method of bias reduction reduces not only the bias but also the variance. P.S.R.S. Rao (1969) investigated the relative efficiencies of $r$ and $r_Q$ with $g = 2$ and $V_m(e_i \mid x_i) = \sigma^2 x_i^t$ under Durbin's model. Rao and Webster (1966) showed that $g = n$ is the optimum choice for $r_Q$ under Durbin's model.
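The grouping, the pseudovalues $wr - (w-1)r_{(j)}$ and the resulting estimator can be sketched as follows. This is an illustrative implementation under assumed conventions: groups are taken as consecutive blocks of the sample, `N=None` means the infinite-population weight $w = g$, and the returned variance is the usual jackknife form $\sum (r_j - r_Q)^2/\{g(g-1)\}$.

```python
def quenouille_ratio(y, x, g, N=None):
    """Quenouille (jackknife) ratio estimator and its jackknife variance."""
    n = len(y)
    if n % g != 0:
        raise ValueError("n must be a multiple of g")
    p = n // g
    r = sum(y) / sum(x)
    # w = g removes the 1/n bias term; with N given, w also handles the 1/N term.
    w = g if N is None else g * (1 - (n - p) / N)
    pseudo = []
    for j in range(g):
        keep = [i for i in range(n) if not (j * p <= i < (j + 1) * p)]
        r_j = sum(y[i] for i in keep) / sum(x[i] for i in keep)
        pseudo.append(w * r - (w - 1) * r_j)
    r_q = sum(pseudo) / g
    v_g = sum((pj - r_q) ** 2 for pj in pseudo) / (g * (g - 1))
    return r_q, v_g


# With y exactly proportional to x there is nothing to correct.
rq_prop, v_prop = quenouille_ratio([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0], g=2)
rq_fin, _ = quenouille_ratio([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0], g=2, N=20)
rq_gen, v_gen = quenouille_ratio([3.0, 5.0, 8.0, 10.0], [1.0, 2.0, 3.0, 4.0], g=4, N=40)
print(rq_prop, rq_fin, rq_gen, v_gen)
```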

The jack-knife estimator of variance is

$$v_g = \frac{1}{g(g-1)}\sum_{j=1}^g (r_j - r_Q)^2.$$

Alternatively, one may use $v_g^* = [1/g(g-1)]\sum (r_j - r)^2$. P.S.R.S. Rao and J.N.K. Rao (1971) have studied the biases and stability of $v_g$ and its other competitors

$$v(r) = \left(\frac{1}{n\bar X^2}\right)\left(\frac{1}{n-1}\right)\sum (y_i - rx_i)^2 \quad \text{and} \quad v^*(r) = \left(\frac{1}{n\bar x^2}\right)\left(\frac{1}{n-1}\right)\sum (y_i - rx_i)^2$$

under the infinite population linear regression model for general values of $t$. They have also made a detailed study of the merits of $r$, $r_{HR}$ and $r_Q$ by comparing their mean square errors for general non-negative $t$.


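b) Modified ratio type estimators (almost unbiased): As derived earlier, the expected value of $r$ ($= \hat R$) to $O(1/n)$ is

$$E(r) = R + \theta_1 R\left[\frac{S_x^2}{\bar X^2} - \frac{S_{xy}}{\bar X\bar Y}\right], \quad \text{where } \theta_1 = \left(\frac{1}{n} - \frac{1}{N}\right).$$

Estimating the first order bias from the sample and subtracting it from the simple ratio estimator $r$, Tin (1965) defined a modified ratio type estimator of the form

$$r_T = r\left[1 + \theta_1\left(\frac{s_{xy}}{\bar x\bar y} - \frac{s_x^2}{\bar x^2}\right)\right].$$

Writing

$$E(r) = R\left[1 + \theta_1\left(\frac{S_x^2}{\bar X^2} - \frac{S_{xy}}{\bar X\bar Y}\right)\right]$$

and dividing $r$ by the estimate of the terms inside the square bracket, we have another modified ratio type estimator due to Sahoo and Swain (1979) given by

$$r_S = r\Big/\left[1 + \theta_1\left(\frac{s_x^2}{\bar x^2} - \frac{s_{xy}}{\bar x\bar y}\right)\right].$$

The possibility of such an estimator was indicated by Koop (1951).

Earlier, Beale (1962) proposed a modified ratio estimator of the type

$$r_B = r\,\frac{1 + \theta_1(s_{xy}/\bar x\bar y)}{1 + \theta_1(s_x^2/\bar x^2)}.$$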
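The three first-order bias corrections (subtracting the estimated relative bias, dividing by one plus it, and Beale's quotient form) differ only in how the same sample quantities $s_x^2/\bar x^2$ and $s_{xy}/\bar x\bar y$ enter. The sketch below computes all three; the function name, dictionary keys and data are assumptions for illustration.

```python
from statistics import mean


def almost_unbiased_ratios(y, x, N):
    """Return r and the Tin, Sahoo-Swain and Beale modifications of it."""
    n = len(y)
    y_bar, x_bar = mean(y), mean(x)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    theta = 1 / n - 1 / N
    c_xx = s_xx / x_bar ** 2           # squared sample CV of x
    c_xy = s_xy / (x_bar * y_bar)      # relative sample covariance
    r = y_bar / x_bar
    return {
        "r": r,
        "tin": r * (1 + theta * (c_xy - c_xx)),
        "sahoo_swain": r / (1 + theta * (c_xx - c_xy)),
        "beale": r * (1 + theta * c_xy) / (1 + theta * c_xx),
    }


est_prop = almost_unbiased_ratios([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0], N=40)
est_gen = almost_unbiased_ratios([3.0, 5.0, 8.0, 10.0], [1.0, 2.0, 3.0, 4.0], N=40)
print(est_prop, est_gen)
```

When $y$ is exactly proportional to $x$ the two relative terms coincide and every version returns $r$ unchanged.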

The biases of $r_S$, $r_T$ and $r_B$ are of $O(1/n^2)$. Comparison of the biases of these estimators to $O(1/n^2)$ shows that $r_B$ is the least biased and $r_S$ is less biased than $r_T$ if the population regression coefficient of $y$ on $x$ is less than the population ratio. To the same order of approximation, $r_B$, $r_T$ and $r_S$ are equally efficient and more efficient than $r$ (Tin, 1965; Sahoo and Swain, 1979). The authors have studied the properties of these estimators under bivariate normality, under Durbin's (1959) super population model and with empirical data. However, for all practical purposes they have the same asymptotic efficiency. Like $r$, they are asymptotically minimum variance bound estimators of $R$, as determined by the Rao-Cramér inequality under the assumption of bivariate normality of $(y, x)$.

Rao and Kuzik (1974) have reported an extensive semi-empirical study on ratio estimators of the population mean under super population models and for a large number of natural populations.

Sahoo (1987) has considered a class of almost unbiased estimators of which $r_B$, $r_T$ and $r_S$ are particular cases.

Tin (1965) has shown that if we modify the estimator $r_T$ by subtracting the estimate of its bias from it, the resulting estimator would be less biased, but less efficient than $r_T$ (under the normal distribution assumption).

8.2.11 Modified Ratio Estimators 

Considering modification of the ratio adjustment, Srivastava (1967) and Walsh (1970) have suggested generalized ratio estimators of the population mean of the types

$$t_{1\alpha} = \bar y\left(\frac{\bar X}{\bar x}\right)^\alpha \quad \text{and} \quad t_{2\alpha} = \frac{\bar y\bar X}{[\alpha\bar x + (1 - \alpha)\bar X]}$$

respectively. Here $\alpha$ is a good guess of $\rho C_y/C_x = \beta/R$, which minimizes the variance of $t_{1\alpha}$ or $t_{2\alpha}$ to $O(1/n)$. To this order, the estimators $t_{1\alpha}$ and $t_{2\alpha}$ have the same efficiency as that of the linear regression estimator $t_{reg} = \bar y + \beta(\bar X - \bar x)$.

In practice an optimum $\alpha$ is not available and therefore neither $t_{1\alpha}$ nor $t_{2\alpha}$ may be effectively used. However, one may use a close approximation to an optimal $\alpha$ either from a pilot survey or from past data. Alternatively, one may use a plausible consistent estimate of $\alpha$ from the sample itself given by $\hat\alpha = \hat\beta/r$, where $\hat\beta = s_{xy}/s_x^2$. Sahoo and Swain (1987) have compared

$$t^*_{1\alpha} = \bar y\left(\frac{\bar X}{\bar x}\right)^{\hat\alpha} \quad \text{and} \quad t^*_{2\alpha} = \frac{\bar y\bar X}{\hat\alpha\bar x + (1 - \hat\alpha)\bar X},$$

where $\alpha$ is estimated from the sample. To $O(1/n)$, $t^*_{1\alpha}$ and $t^*_{2\alpha}$ have the same asymptotic efficiency, equal to that of the linear regression estimator when $\beta$ is estimated from the sample. To $O(1/n^2)$ and under bivariate normality, $t^*_{1\alpha}$ and $t^*_{2\alpha}$ are conditionally superior to the aforesaid linear regression estimate. T.J. Rao (1993) has shown that Srivastava's estimator $t_{1\alpha}$ may be called a repeated substitution estimator, which is obtained by defining $\hat{\bar Y}_R^{(1)} = \hat{\bar Y}(\bar X/\bar x)$, where $\hat{\bar Y} = \bar y$, and repeatedly substituting $\hat{\bar Y}_R$ in place of $\hat{\bar Y}$ on the right hand side to obtain a modified estimator. This gives $\hat{\bar Y}_R^{(2)} = \hat{\bar Y}_R^{(1)}(\bar X/\bar x) = \bar y(\bar X/\bar x)^2$ and so on. Repeating this iteration $\alpha$ times, where $\alpha$ is an integer, gives $\hat{\bar Y}_R^{(\alpha)} = \bar y(\bar X/\bar x)^\alpha$.

With a view to improving the efficiency of ratio type estimators in the presence of auxiliary information, a number of modified ratio type estimators have been suggested in the literature. Some of these are given below.

Mishra and Mohanty (1978), Ray, Sahai and Sahai (1978) and Vos (1980) have proposed modified ratio type estimators taking the mixture of simple expansion, ratio and product estimators such as

$$t_{1w} = w\bar y + (1 - w)\bar y(\bar X/\bar x) \quad \text{and} \quad t_{2w} = w\bar y + (1 - w)\bar y(\bar x/\bar X),$$

where $0 \le w \le 1$.

Sahai (1979) proposed a modified estimator $t_\theta = \bar y(\bar x + \theta\bar X)/(\bar X + \theta\bar x)$, $\theta$ being a scalar. Ray and Sahai (1980) have suggested two-parameter families of ratio type and product type estimators for the finite population mean as

$$t_{1RS} = \frac{\bar y(k\bar X + \theta\bar x)}{\bar x + (k + \theta - 1)\bar X} \quad \text{and} \quad t_{2RS} = \frac{\bar y(k\bar x + \theta\bar X)}{\bar X + (k + \theta - 1)\bar x},$$

where $0 < \theta < 1$ and $k$ is a non-negative integer. Hansen, Hurwitz and Madow (1952) suggested a transformation $z_i = 1/X_i$ and the corresponding estimate $\hat{\bar Y}'_R = \bar y(\bar Z/\bar z)$, which may be found to be superior to the simple expansion estimate when $\hat{\bar Y}_R$ is not.

Shah and Adhvaryu (1970) and Reddy (1974) have constructed a modified ratio type estimator by transforming the values of the auxiliary variable as $X'_i = X_i + k\bar X$. Mohanty and Das (1971) and Srivenkataramana (1978) have considered the situation where the transformation of origin and scale of the measurement of the auxiliary variate is utilised to reduce the bias of the estimate and simultaneously increase the efficiency of the estimate in the case of the ratio method of estimation. Sisodia and Dwivedi (1981) suggested another transformed ratio estimator using the transformation $X'_i = X_i + C_x$, where $C_x$ is the population coefficient of variation of $x$. Thomsen (1977) has considered a simple transformation by adding a positive constant $k$ to all observations $X_i$ and has suggested an estimate of the form $\hat{\bar Y}''_R = \bar y(\bar X + k)/(\bar x + k)$. The optimum $k$ turns out to be $k^* = \bar Y(S_x^2/S_{yx}) - \bar X$. He considers two estimators of $k^*$, namely $k^*_1 = \bar y(s_x^2/s_{xy}) - \bar x$ and $k^*_2 = \bar y(s_x^2/s_{xy}) - \bar X$. By successive substitutions,

$$\hat{\bar Y}''_R(k^*_1) = \bar y + (s_{xy}/s_x^2)(\bar X - \bar x)$$

and

$$\hat{\bar Y}''_R(k^*_2) = \bar y\big/\left[1 + (s_{xy}/s_x^2)(\bar x - \bar X)/\bar y\right],$$

whose first two terms in the Taylor series expansion approximate it to the simple regression estimate.
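The first substitution can be confirmed numerically: plugging $k = \bar y(s_x^2/s_{xy}) - \bar x$ into $\bar y(\bar X + k)/(\bar x + k)$ gives exactly the sample linear regression estimate. The helper names and data below are hypothetical.

```python
from statistics import mean


def shifted_ratio(y_bar, x_bar, X_bar, k):
    """Ratio estimator of the mean after adding a constant k to every x."""
    return y_bar * (X_bar + k) / (x_bar + k)


y = [3.0, 5.0, 8.0, 10.0]
x = [1.0, 2.0, 3.0, 4.0]
X_bar = 3.0                                   # assumed known population mean of x
n = len(y)
y_bar, x_bar = mean(y), mean(x)
s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

k1 = y_bar * s_xx / s_xy - x_bar              # sample analogue of the optimum shift
est_shift = shifted_ratio(y_bar, x_bar, X_bar, k1)
est_reg = y_bar + (s_xy / s_xx) * (X_bar - x_bar)
print(est_shift, est_reg)
```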

Following Searls (1964), one can construct a modified ratio estimator with advance knowledge of the population coefficients of variation of $y$ and $x$, given by $r' = r(1 + \theta_1 C_x^2)/(1 + \theta_1 C_y^2)$ (Swain, 1981).

Gandge and Prabhu-Ajgaonkar (1989) formulated a Searls type ratio estimator of the population ratio $R$ as $r^* = \lambda r$, where the optimum value of $\lambda$ depends on $C_x$, $C_y$ and $\rho$, and information about these parameters is necessary for its application in practice.

8.2.12 General Classes of Ratio Estimators

Srivastava (1971) defined a large class of estimators of $\bar Y$ expressed in the form $\hat{\bar Y}_{R(s)} = \bar y\,h(\bar x/\bar X)$, where $h(\cdot)$ is a parametric function such that $h(1) = 1$ and which satisfies certain regularity conditions as to continuity of the function and existence of derivatives of certain orders, etc. By expanding $\bar y\,h(\bar x/\bar X)$ by Taylor's series, Srivastava has shown that

$$E(\hat{\bar Y}_{R(s)}) = \bar Y + O(1/n)$$

and, up to $O(1/n)$,

$$V(\hat{\bar Y}_{R(s)}) \ge [(N - n)/Nn]\,S_y^2(1 - \rho^2),$$

which is the variance of the linear regression estimator.

Subsequently, Srivastava (1980) put forward another broader class of estimators defined by $\hat{\bar Y}^*_{R(s)} = g(\bar y, \bar x)$, where $g$ is a continuous function of $\bar y$ and $\bar x$ having continuous first and second order derivatives which are bounded in $R_2$ ($\bar y$ and $\bar x$ assume values in a closed convex subset $R_2$ of the two dimensional real space containing the point $(\bar Y, \bar X)$) and such that $g(\bar Y, \bar X) = \bar Y$ for all $\bar Y$. The minimum asymptotic variance of $\hat{\bar Y}^*_{R(s)}$ is the same as the variance of the linear regression estimator $\bar y + \beta(\bar X - \bar x)$.

Swain (1973) has suggested a generalized estimator of $R$ as

$$\hat R_{a\lambda} = \frac{\{\bar y + a(\bar X - \bar x)\}}{\{\lambda\bar x + (1 - \lambda)\bar X\}},$$

where $a$ is either a constant or a consistent estimator of its analogous function of the population values and $0 \le \lambda \le 1$. Rao (1968) considered a ratio estimator of the form $\hat R_{b0} = \{\bar y + b(\bar X - \bar x)\}/\bar X$, where $b$ is the sample regression coefficient of $y$ on $x$, which is a particular case of $\hat R_{a\lambda}$, and compared its efficiency along with other estimators in the case of smallish populations.

8.2.13 Almost Unbiased Ratio Estimators Based on 
Interpenetrating Sub-samples (IPNS) 

Suppose the sample is drawn in the form of $m$ independent interpenetrating
sub-samples of the same size $p$ following the same sampling design. Let $\hat Y_j$ and
$\hat X_j$ be the unbiased estimates of the population totals $Y$ and $X$ based on the $j$th
sub-sample ($j = 1, \ldots, m$).

Consider two estimators of the population ratio $R$ as

$$r_m = \frac{\frac{1}{m}\sum_{j=1}^{m}\hat Y_j}{\frac{1}{m}\sum_{j=1}^{m}\hat X_j} = \frac{\hat Y}{\hat X} \quad\text{and}\quad \bar r_m = \frac{1}{m}\sum_{j=1}^{m} r_j,$$

where $r_j = \hat Y_j/\hat X_j$. Approximately, the bias of $r_m$ is

$$B(r_m) = \frac{1}{X^2}\,[R\,V(\hat X) - \mathrm{cov}(\hat X, \hat Y)] = \frac{1}{m^2X^2}\sum_{j=1}^{m}[R\,V(\hat X_j) - \mathrm{cov}(\hat X_j, \hat Y_j)] = \frac{1}{m^2}\sum_{j=1}^{m} B(r_j).$$

Further, $B(\bar r_m) = \frac{1}{m}\sum_{j=1}^{m} B(r_j)$. Thus, $m\,B(r_m) = B(\bar r_m)$.

Now, $E(\bar r_m - r_m) = B(\bar r_m) - B(r_m) = (m-1)\,B(r_m)$, from which an
approximately unbiased estimate of $B(r_m)$ is Est. $B(r_m) = (\bar r_m - r_m)/(m-1)$.

Murthy and Nanjamma (1959) have suggested an almost unbiased ratio
estimator by correcting the bias as $r_{MN} = r_m - \text{Est. } B(r_m)$. Thus,

$$r_{MN} = r_m - \frac{\bar r_m - r_m}{m-1}.$$

For $m = n$ and $p = 1$, the IPNS estimator of the ratio in SRSWR reduces to
$r'_{MN} = (n r - \bar r)/(n-1)$, where $r = \bar y/\bar x$ and $\bar r$ is the mean of the unit ratios $y_i/x_i$.

In the case of a SRSWOR design, the Murthy-Nanjamma estimator may be
derived as $r''_{MN} = [(N-1)\,n\,r - (N-n)\,\bar r]/[N(n-1)]$.
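The bias correction above can be sketched numerically. The following is a minimal illustration only: the function name `murthy_nanjamma` and the sub-sample figures are hypothetical and not taken from the text. It computes the pooled ratio $r_m$, the mean of ratios $\bar r_m$ and the corrected estimator $r_{MN}$.

```python
def murthy_nanjamma(y_hats, x_hats):
    """Return (r_m, rbar_m, r_MN) from m sub-sample estimates of the totals."""
    m = len(y_hats)
    if m < 2 or m != len(x_hats):
        raise ValueError("need at least two paired sub-sample estimates")
    r_m = sum(y_hats) / sum(x_hats)  # ratio of the pooled estimates
    rbar_m = sum(y / x for y, x in zip(y_hats, x_hats)) / m  # mean of ratios
    est_bias = (rbar_m - r_m) / (m - 1)  # Est. B(r_m)
    return r_m, rbar_m, r_m - est_bias


# Made-up sub-sample estimates of the totals Y and X (assumption).
y_hats = [120.0, 135.0, 128.0, 140.0]
x_hats = [60.0, 66.0, 65.0, 68.0]
r_m, rbar_m, r_mn = murthy_nanjamma(y_hats, x_hats)
```

Algebraically, the returned $r_{MN}$ equals $(m\,r_m - \bar r_m)/(m-1)$, which is the form quoted for $m = n$, $p = 1$.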

Another almost unbiased estimator of the population ratio
based on IPNS may be derived as follows. Write $\hat Y = \frac{1}{m}\sum_j \hat Y_j$, $\hat X = \frac{1}{m}\sum_j \hat X_j$,
$r_m = \hat Y/\hat X$ and $\bar r_m = \frac{1}{m}\sum_j r_j$. Now,

$$B(\bar r_m) = -\frac{1}{X}\,\mathrm{cov}(r_j, \hat X_j),$$

$$\text{Est. } B(\bar r_m) = -\frac{1}{X}\,\text{Est. cov}(r_j, \hat X_j) = -\frac{1}{X}\cdot\frac{1}{m-1}\sum_{j=1}^{m}(r_j - \bar r_m)(\hat X_j - \hat X) = -\frac{1}{X}\cdot\frac{m}{m-1}\,(\hat Y - \bar r_m \hat X).$$

An almost unbiased estimate of $R$ is therefore given by

$$R^{*} = r_m - \text{Est. } B(r_m) = r_m - \text{Est. } \frac{1}{m} B(\bar r_m) = r_m + \frac{1}{(m-1)X}\,(\hat Y - \bar r_m \hat X).$$

With $m = n$, $p = 1$ and in SRSWOR, $R^{*}$ reduces to

$$r_p = r + \left(\frac{N-1}{N}\right)\frac{1}{(n-1)\bar X}\,(\bar y - \bar r\,\bar x),$$

which is due to Nieto de Pascual (1961). The mean square errors of $r$ and $r_p$ are
asymptotically equivalent.

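T.J. Rao (1981) has proposed a general class of almost unbiased ratio
estimators of the population total $Y$ under the IPNS set-up by considering the linear
combination $\hat Y_{aur} = [\lambda r_m + \{1 - E(\lambda)\}\bar r_m]\,X$, where $\lambda$ is a random variable,
$r_m = \hat Y/\hat X$ is the ratio of the pooled estimates and $\bar r_m$ is the mean of the sub-sample ratios. For
unbiasedness of $\hat Y_{aur}$, $E[\lambda r_m + \{1 - E(\lambda)\}\bar r_m] = R$. That is, for any constant $k$,

$$E[\lambda r_m - E(\lambda)\bar r_m] = R - E(\bar r_m) = \frac{1}{X}\,E[\hat Y - \bar r_m X] = \frac{1}{X}\,E[k r_m X + \hat Y - \bar r_m X - k r_m X].$$

Now, using the approximation $B(r_m) = \frac{1}{m}B(\bar r_m)$,

$$E(r_m) = \frac{1}{m}\,[E(\bar r_m) + (m-1)R] = (1-c)\,E(\bar r_m) + cR = E\left[(1-c)\,\bar r_m + c\,\frac{\hat Y}{X}\right],$$

where $c = (m-1)/m$. That is, in expectation $r_m$ may be replaced by $(1-c)\,\bar r_m + c\,\hat Y/X$. Thus, since $\hat Y = r_m \hat X$,

$$E[\lambda r_m - E(\lambda)\bar r_m] = \frac{1}{X}\,E\left[k r_m X + \hat Y - \bar r_m X - kX\left\{(1-c)\,\bar r_m + c\,\frac{\hat Y}{X}\right\}\right]$$

$$= \frac{1}{X}\,E\left[r_m X\left\{k + (1-kc)\frac{\hat X}{X}\right\} - \bar r_m X\,\{1 + k(1-c)\}\right]$$

$$= E\left[r_m\left\{k + (1-kc)\frac{\hat X}{X}\right\} - \bar r_m\,\{1 + k(1-c)\}\right].$$

Hence, $\lambda = k + (1-kc)\,\hat X/X$ is a solution, which generates a class of almost
unbiased estimators. Thus, Rao's (1981) class of almost unbiased ratio estimators
of $Y$ is given by

$$\hat Y_{aur} = [\lambda r_m + \{1 - E(\lambda)\}\bar r_m]\,X,$$

where $\lambda = k + (1-kc)\,\hat X/X$ for any constant $k$ and $c = (m-1)/m$.

For $k = 1/c$, $\hat Y_{aur}$ reduces to Murthy and Nanjamma's (1959) estimator

$$\hat Y_{MN} = X\left[r_m - \frac{\bar r_m - r_m}{m-1}\right].$$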
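As a numerical check on this class, the sketch below evaluates $[\lambda r_m + \{1 - E(\lambda)\}\bar r_m]X$ with $\lambda = k + (1-kc)\hat X/X$ and $E(\lambda) = 1 + k(1-c)$. The function name `rao_class_estimate` and the data are hypothetical, chosen only for illustration.

```python
def rao_class_estimate(y_hats, x_hats, X, k):
    """Member of the almost unbiased class for a given constant k."""
    m = len(y_hats)
    c = (m - 1) / m
    x_hat = sum(x_hats) / m  # pooled estimate of the total X
    r_m = sum(y_hats) / sum(x_hats)  # ratio of pooled estimates
    rbar_m = sum(y / x for y, x in zip(y_hats, x_hats)) / m  # mean of ratios
    lam = k + (1 - k * c) * x_hat / X
    e_lam = 1 + k * (1 - c)  # E(lambda), because E(X_hat) = X
    return (lam * r_m + (1 - e_lam) * rbar_m) * X


# Made-up sub-sample estimates and a known total X (assumptions).
ys = [120.0, 135.0, 128.0, 140.0]
xs = [60.0, 66.0, 65.0, 68.0]
X_total = 64.0
m = len(ys)
at_k_inverse_c = rao_class_estimate(ys, xs, X_total, k=m / (m - 1))
at_k_zero = rao_class_estimate(ys, xs, X_total, k=0.0)
```

With $k = 1/c$ the value coincides with $X(m\,r_m - \bar r_m)/(m-1)$, and with $k = 0$ it collapses to the pooled unbiased estimate $\hat Y$.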


T.J. Rao (1983c) has also given a class of almost unbiased product 
estimators in IPNS set up, which includes Robson’s (1957) Product estimator in 
SRSWOR as a special case. 


8.2.14 Double Ratio Estimators 

Suppose the study characteristic ($y$) and the auxiliary characteristic ($x$) are
observed for a SRSWOR of size $n$ both on the current occasion and on a previous
occasion as $(y_{1i}, x_{1i})$ and $(y_{0i}, x_{0i})$ respectively, $i = 1, \ldots, n$. With the usual notation,
let $\bar y_1$, $\bar x_1$, $\bar y_0$ and $\bar x_0$ be the sample means and $\bar Y_1$, $\bar X_1$, $\bar Y_0$ and $\bar X_0$ be the
corresponding population means. Define $r_1 = \bar y_1/\bar x_1$, $r_0 = \bar y_0/\bar x_0$, $R_1 = \bar Y_1/\bar X_1$ and
$R_0 = \bar Y_0/\bar X_0$. Thus, we have

Estimator of population mean $\bar Y_1$: $\bar y_{1d} = (r_1/r_0)\,R_0\,\bar X_1$

Estimator of ratio $R_1 = \bar Y_1/\bar X_1$: $r_{1d} = (r_1/r_0)\,R_0$

Estimator of ratio $R = R_1/R_0$: $r_d = r_1/r_0$

As these estimators are based on $r_1/r_0$, which is a ratio of ratios, they are
called double ratio estimators.
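The three double ratio estimators can be computed directly from the two sets of sample means. The sketch below is illustrative only; the function name `double_ratio` and the numbers are assumptions, and $R_0$ and $\bar X_1$ are taken as known.

```python
def double_ratio(y1, x1, y0, x0, R0, X1_bar):
    """Return (ybar_1d, r_1d, r_d) for current (1) and previous (0) occasions."""
    r1 = (sum(y1) / len(y1)) / (sum(x1) / len(x1))  # current-occasion ratio
    r0 = (sum(y0) / len(y0)) / (sum(x0) / len(x0))  # previous-occasion ratio
    r_d = r1 / r0  # ratio of ratios
    r_1d = r_d * R0  # estimator of R_1
    ybar_1d = r_1d * X1_bar  # estimator of the mean Y_1
    return ybar_1d, r_1d, r_d


# Made-up sample values (assumption): r1 = 2 and r0 = 1.5.
ybar_1d, r_1d, r_d = double_ratio(
    y1=[4.0, 6.0], x1=[2.0, 3.0], y0=[3.0, 3.0], x0=[2.0, 2.0],
    R0=1.5, X1_bar=10.0,
)
```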

Approximate biases and mean square errors have been obtained by J.N.K.
Rao and Pereira (1968) using conventional techniques.

Keyfitz (Yates, 1960) first used $N\bar y_{1d}$ to estimate the total labour force
from sample data on current and past censuses of production and labour force.
J.N.K. Rao (1957) made use of $r_{1d}$ in forest surveys to estimate the ratio of
volume of timber in two diameter classes. Kish (1968) used $r_d$ to construct
index numbers measuring economic and social indicators. Smith (1966)
discussed the ratio estimator of ratios $r_d$ and applied it to estimate demographic
parameters. Kish and Hess (1959) and Kish (1965) have made studies on the
difference of ratios $r$ and $r'$. Kish, Namboodiri and Pillai (1962) have shown
empirically that Bias$(r - r')$/S.E.$(r - r')$ may be assumed to be negligible.


8.2.15 Multi-stage Ratio Estimator 

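Let $X_{ij}$ be the value of the auxiliary variable $x$ corresponding to the $j$th secondary unit
(second stage unit) of the $i$th primary unit (first stage unit), $i = 1, \ldots, N$
and $j = 1, \ldots, M_i$.

Define $V_{ij} = (M_i/\bar M)\,X_{ij}$,

$$\bar V_i = (M_i/\bar M)\,\bar X_i, \text{ where } \bar X_i = \frac{1}{M_i}\sum_{j=1}^{M_i} X_{ij}, \text{ and } \bar V = \frac{1}{N}\sum_{i=1}^{N}\bar V_i = \bar X,$$

$\bar M$ being the average number of secondary units per primary unit.

Further, for the study variable define

$$U_{ij} = (M_i/\bar M)\,Y_{ij}, \quad \bar U_i = (M_i/\bar M)\,\bar Y_i \quad\text{and}\quad \bar U = \bar Y.$$

Consider a two-stage sampling where $n$ primary units are selected by
SRSWOR from the $N$ primary units in the population and, from the $i$th primary unit
selected, a SRSWOR of $m_i$ secondary units is selected. Thus, for the two-stage
sample define

$$\bar v_i = \frac{1}{m_i}\sum_{j=1}^{m_i} v_{ij} = \frac{1}{m_i}\sum_{j=1}^{m_i}\frac{M_i}{\bar M}\,x_{ij}, \qquad \bar u_i = \frac{1}{m_i}\sum_{j=1}^{m_i} u_{ij} = \frac{1}{m_i}\sum_{j=1}^{m_i}\frac{M_i}{\bar M}\,y_{ij},$$

$$\bar v = \frac{1}{n}\sum_{i=1}^{n}\bar v_i, \qquad \bar u = \frac{1}{n}\sum_{i=1}^{n}\bar u_i.$$

Thus, the ratio estimator of the population mean in two-stage sampling is given
by $\hat{\bar Y}_{R(2)} = (\bar u/\bar v)\,\bar X$, which is a biased estimator. To the first order of
approximation,

$$\text{MSE}(\hat{\bar Y}_{R(2)}) = \left(\frac{1}{n} - \frac{1}{N}\right)[S_u^2 + R^2 S_v^2 - 2R S_{uv}] + \frac{1}{nN}\sum_{i=1}^{N}\left(\frac{M_i}{\bar M}\right)^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)[S_{iy}^2 + R^2 S_{ix}^2 - 2R S_{ixy}],$$

where

$$S_u^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\bar U_i - \bar U)^2, \quad S_v^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\bar V_i - \bar V)^2, \quad S_{uv} = \frac{1}{N-1}\sum_{i=1}^{N}(\bar U_i - \bar U)(\bar V_i - \bar V),$$

$$S_{iy}^2 = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}(Y_{ij} - \bar Y_i)^2, \quad S_{ix}^2 = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}(X_{ij} - \bar X_i)^2 \quad\text{and}\quad S_{ixy} = \frac{1}{M_i - 1}\sum_{j=1}^{M_i}(Y_{ij} - \bar Y_i)(X_{ij} - \bar X_i).$$

As known earlier, the estimator of the population mean in two-stage sampling without
use of auxiliary information is

$$\hat{\bar Y}_{u(2)} = \bar u = \frac{1}{n}\sum_{i=1}^{n}\bar u_i = \frac{1}{n}\sum_{i=1}^{n}\frac{M_i}{\bar M}\,\bar y_i,$$

$$V(\hat{\bar Y}_{u(2)}) = \left(\frac{1}{n} - \frac{1}{N}\right)S_u^2 + \frac{1}{nN}\sum_{i=1}^{N}\left(\frac{M_i}{\bar M}\right)^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{iy}^2.$$

Defining $\rho_b = S_{uv}/(S_u S_v)$, $\rho_i = S_{ixy}/(S_{ix}S_{iy})$, $C_{bx} = S_v/\bar X$, $C_{by} = S_u/\bar Y$,
$C_{ix} = S_{ix}/\bar X_i$ and $C_{iy} = S_{iy}/\bar Y_i$,
it may be seen that $\hat{\bar Y}_{R(2)}$ would be more efficient than $\hat{\bar Y}_{u(2)}$ if the sufficient
conditions $\rho_b > \frac{1}{2}\,C_{bx}/C_{by}$ and $\rho_i > \frac{1}{2}\,C_{ix}/C_{iy}$ for all $i$ are satisfied.

Murthy (1967) suggested a chain ratio estimator of $\bar Y$, which is defined as

$$\hat{\bar Y}_{CR(2)} = \frac{\frac{1}{n}\sum_{i=1}^{n}\frac{M_i}{\bar M}\,r_i\,\bar X_i}{\frac{1}{n}\sum_{i=1}^{n}\frac{M_i}{\bar M}\,\bar X_i}\;\bar X,$$

where $r_i = \bar y_i/\bar x_i$.
$\hat{\bar Y}_{CR(2)}$ is a biased estimator and the bias may be negligible for large $n$ and
$m_i$'s. To the first order of approximation,

$$\text{MSE}(\hat{\bar Y}_{CR(2)}) = \left(\frac{1}{n} - \frac{1}{N}\right)(S_u^2 + R^2 S_v^2 - 2R S_{uv}) + \frac{1}{nN}\sum_{i=1}^{N}\left(\frac{M_i}{\bar M}\right)^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)(S_{iy}^2 + R_i^2 S_{ix}^2 - 2R_i S_{ixy}),$$

where $R_i = \bar Y_i/\bar X_i$ and $R = \bar Y/\bar X$.

Defining $\beta_i = S_{ixy}/S_{ix}^2$, we find

$$\text{MSE}(\hat{\bar Y}_{R(2)}) - \text{MSE}(\hat{\bar Y}_{CR(2)}) = \frac{1}{nN}\sum_{i=1}^{N}\left(\frac{M_i}{\bar M}\right)^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ix}^2\,[(R - \beta_i)^2 - (R_i - \beta_i)^2].$$

Thus, $\hat{\bar Y}_{CR(2)}$ is more efficient than $\hat{\bar Y}_{R(2)}$ if each $\beta_i$ is nearer to $R_i$ than to $R$.

For further details the reader may refer to Sahoo and Swain (1983). The
usual bias reduction techniques may also be applied to $\hat{\bar Y}_{R(2)}$ and $\hat{\bar Y}_{CR(2)}$.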
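A minimal computational sketch of the two-stage ratio estimator $(\bar u/\bar v)\bar X$ follows. The function name `two_stage_ratio`, the data layout (one tuple per selected primary unit) and the numbers are assumptions made for illustration.

```python
def two_stage_ratio(sample, M_bar, X_bar):
    """sample: list of (M_i, y_values, x_values), one per selected primary unit.

    Returns (ratio_estimate, u_bar), where u_bar is the estimator of the
    mean that makes no use of the auxiliary variable.
    """
    n = len(sample)
    # u_bar and v_bar are means of (M_i / M_bar) times the sub-sample means.
    u_bar = sum(M_i / M_bar * sum(y) / len(y) for M_i, y, _ in sample) / n
    v_bar = sum(M_i / M_bar * sum(x) / len(x) for M_i, _, x in sample) / n
    return u_bar / v_bar * X_bar, u_bar


# Made-up data in which y = 2.5 x exactly, so the ratio estimate is 2.5 * X_bar.
sample = [
    (10, [5.0, 10.0], [2.0, 4.0]),
    (20, [7.5, 2.5, 5.0], [3.0, 1.0, 2.0]),
]
estimate, u_bar = two_stage_ratio(sample, M_bar=15.0, X_bar=2.4)
```

When $y$ is exactly proportional to $x$, as in this toy data, the ratio estimator reproduces $R\bar X$ whatever units are drawn, which is the situation in which it gains most over $\bar u$.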


8.3 Regression Method of Estimation 
8.3.1 Linear Regression Estimator 

Sometimes data on the study variable ($y$) along with values of a correlated variable
($x$) are available, and plotting the sample values gives a scatter diagram
in which the points concentrate along a straight line, not
necessarily one passing through the origin. In such a situation it may be
worthwhile to estimate the regression function from the given sample, provided
that the assumptions of linear regression analysis, namely

i) linearity of the regression of $y$ on $x$ in the population,

ii) constancy of the residual variance of $y$ about the regression line, and

iii) infinite nature of the population,

are approximately satisfied.

Thus, for an SRS of size $n$, the estimated line of regression under a linear
regression model is

$$\hat y = \hat\alpha + \hat\beta x,$$

where $\hat\alpha$ and $\hat\beta$ are the least squares estimates given by $\hat\alpha = \bar y - b\bar x$ and

$$\hat\beta = \frac{\sum_{i=1}^{n}(y_i - \bar y)(x_i - \bar x)}{\sum_{i=1}^{n}(x_i - \bar x)^2} = b, \text{ say}.$$

Thus, we may write

$$\hat Y = \sum_{i=1}^{N}(\hat\alpha + \hat\beta X_i) = N[\bar y + b(\bar X - \bar x)].$$

Also, $\hat{\bar Y} = \bar y + b(\bar X - \bar x)$. $\hat Y$ and $\hat{\bar Y}$ are called linear regression estimates of the
population total and population mean and may be specified either as $\hat Y_{Reg}$ and
$\hat{\bar Y}_{Reg}$ or as $N\bar y_{reg}$ and $\bar y_{reg}$ respectively.

Alternative Derivation of the Regression Estimator

The least squares regression line of $y$ on $x$ based on a sample is

$$\hat y = \bar y + b(x - \bar x).$$

This regression line passes through the point $(\bar y, \bar x)$ as well as the point $(\hat{\bar Y}_{Reg}, \bar X)$.
Thus, the slope of the line is $b = (\hat{\bar Y}_{Reg} - \bar y)/(\bar X - \bar x)$. Hence,

$$\hat{\bar Y}_{Reg} = \bar y + b(\bar X - \bar x).$$
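The estimator $\bar y + b(\bar X - \bar x)$ is easy to compute from a sample once $\bar X$ is known. The sketch below is illustrative; the function name `regression_estimate` and the data are assumptions.

```python
def regression_estimate(y, x, X_bar):
    """Return (y_reg, b): the linear regression estimate of the mean and the slope."""
    n = len(y)
    y_bar, x_bar = sum(y) / n, sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((yi - y_bar) * (xi - x_bar) for yi, xi in zip(y, x))
    b = sxy / sxx  # least squares slope of y on x
    return y_bar + b * (X_bar - x_bar), b


# Made-up population with Y = 3 + 2X on X = 1, ..., 10, so the mean of Y is 14.
x_sample = [2.0, 5.0, 9.0]
y_sample = [3.0 + 2.0 * xi for xi in x_sample]
y_reg, b = regression_estimate(y_sample, x_sample, X_bar=5.5)
```

Because the toy relationship is exactly linear, the estimate equals the population mean for any sample, which illustrates why the bias vanishes under exact linearity.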

Relation between Sample Regression Coefficient b and 
Regression Coefficient B in the Finite Population 

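Let the assumed linear model for $(X_i, Y_i)$, $i = 1, \ldots, N$, be

$$Y_i = \alpha + \beta X_i + e_i.$$

The least squares regression line of $y$ on $x$ is $Y_{i(exp)} = \bar Y + B(X_i - \bar X)$, where

$$B = \sum_{i=1}^{N}(Y_i - \bar Y)(X_i - \bar X)\Big/\sum_{i=1}^{N}(X_i - \bar X)^2.$$

Define $d_i = Y_i - Y_{i(exp)} = Y_i - \bar Y - B(X_i - \bar X)$. We have, therefore,

$$\sum_{i=1}^{N} d_i = 0 \quad\text{and}\quad \sum_{i=1}^{N} d_i (X_i - \bar X) = \sum_{i=1}^{N}(Y_i - \bar Y)(X_i - \bar X) - B\sum_{i=1}^{N}(X_i - \bar X)^2 = 0.$$

Now,

$$b = \frac{\sum_{i=1}^{n}(y_i - \bar y)(x_i - \bar x)}{\sum_{i=1}^{n}(x_i - \bar x)^2} = \frac{\sum_{i=1}^{n} y_i (x_i - \bar x)}{\sum_{i=1}^{n}(x_i - \bar x)^2} = \frac{\sum_{i=1}^{n}[\bar Y + B(x_i - \bar X) + d_i](x_i - \bar x)}{\sum_{i=1}^{n}(x_i - \bar x)^2} = B + \frac{\sum_{i=1}^{n} d_i (x_i - \bar x)}{\sum_{i=1}^{n}(x_i - \bar x)^2}.$$

Cochran (1963) has shown that $(b - B)$ is of order $1/\sqrt{n}$.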

Bias of Linear Regression Estimator

$$\text{Bias}(\hat{\bar Y}_{Reg}) = E(\hat{\bar Y}_{Reg}) - \bar Y = -\mathrm{cov}(b, \bar x),$$

which vanishes if the relationship between $y$ and $x$ is linear.

An approximate expression for the bias is obtained by expanding $\hat{\bar Y}_{Reg}$ in a
Taylor series, taking the expectation of the leading terms and using the formulae
(due to Sukhatme, 1944)

$$\mathrm{cov}(\bar x, s_{xy}) = \frac{N(N-n)}{(N-1)(N-2)}\,\frac{\mu_{21}}{n} \quad\text{and}\quad \mathrm{cov}(\bar x, s_x^2) = \frac{N(N-n)}{(N-1)(N-2)}\,\frac{\mu_{30}}{n},$$

where $\mu_{rs} = \frac{1}{N}\sum_{i=1}^{N}(X_i - \bar X)^r (Y_i - \bar Y)^s$.

Hence, to the first order of approximation,

$$\text{Bias}(\hat{\bar Y}_{Reg}) = -\frac{(N-n)}{(N-2)}\,\frac{B}{n}\left[\frac{\mu_{21}}{\mu_{11}} - \frac{\mu_{30}}{\mu_{20}}\right].$$

Mean Square Error of $\bar y_{reg}$

Defining $\bar y_{reg} = \bar y + b(\bar X - \bar x)$, where $b$ is the least squares estimate of $B$, the error
in $\bar y_{reg}$ as an estimate of $\bar Y$ is given by $\bar y_{reg} - \bar Y = \bar y - \bar Y + b(\bar X - \bar x)$.

If $B$ replaces $b$ in the formula for the regression estimate, that is,

$$\bar y'_{reg} = \bar y + B(\bar X - \bar x),$$

the error committed in such an approximation in SRSWOR is $(b - B)(\bar X - \bar x)$.
This is of $O(1/n)$, as both $(b - B)$ and $(\bar X - \bar x)$ are of order $1/\sqrt{n}$. In large samples
this error may be neglected and we have

$$\text{MSE}(\bar y_{reg}) \approx V(\bar y'_{reg}) = \frac{(1-f)}{n}\,S_y^2\,(1-\rho^2).$$

Assuming that the $Y_i$'s ($i = 1, \ldots, N$) are a random sample from an infinite
population under the model

$$Y_i = \alpha + \beta X_i + e_i,$$

where $E_m(e_i \mid X_i) = 0$, $E_m(e_i^2 \mid X_i) = \sigma_e^2 = \sigma_y^2(1-\rho^2)$ and $E_m(e_i e_j \mid X_i, X_j) = 0$ ($i \ne j$),
Cochran (1942) showed that to $O(1/n^2)$,

$$V(\bar y_{reg}) = \frac{(1-f)}{n}\,\sigma_y^2\,(1-\rho^2)\left[1 + \frac{1}{n-3} + \frac{2G_1^2}{n^2}\right],$$

where $G_1 = k_3/\sigma_x^3$ is Fisher's measure of relative skewness of the distribution
of $x$.

Large Sample Comparison of $\bar y_{reg}$ with Ratio Estimate $\hat{\bar Y}_R$ and Simple
Expansion Estimator $\bar y$

Large sample approximate expressions for $V(\bar y_{reg})$ and $V(\hat{\bar Y}_R)$ are

$$V(\bar y_{reg}) = \frac{(1-f)}{n}\,S_y^2\,(1-\rho^2),$$

$$V(\hat{\bar Y}_R) = \frac{(1-f)}{n}\,(S_y^2 + R^2 S_x^2 - 2R\rho S_y S_x).$$

Further, $V(\bar y) = [(1-f)/n]\,S_y^2$.

$\bar y_{reg}$ is more efficient than $\bar y$ unless $\rho = 0$. Further, the regression estimate
$\bar y_{reg}$ is more precise than $\hat{\bar Y}_R$ if $(\rho S_y - R S_x)^2 > 0$, that is, $(B - R)^2 > 0$, which is
always true unless $B = R$. If $B = R$, that is, the relation between $Y_i$ and $X_i$ is a straight line
through the origin, both the ratio and regression estimators have the same variance.

In small samples it is likely that the ratio estimator may have smaller 
variance than the linear regression estimator. 

Further, the larger mean square error of the regression estimator in small 
samples compared to ratio estimator may be sometimes due to higher variance 
rather than to higher bias. The regression estimator scores over ratio estimator 
as regards efficiency when the intercept in the linear regression relation 
explains a larger part of the variation in y. 
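The large sample comparison can be checked numerically for any finite population. The sketch below uses a hypothetical function `large_sample_variances` and made-up population values; it evaluates the first-order variances of $\bar y$, the ratio estimator and the regression estimator under SRSWOR.

```python
def large_sample_variances(Y, X, n):
    """First-order variances of the mean, ratio and regression estimators."""
    N = len(Y)
    Y_bar, X_bar = sum(Y) / N, sum(X) / N
    Sy2 = sum((y - Y_bar) ** 2 for y in Y) / (N - 1)
    Sx2 = sum((x - X_bar) ** 2 for x in X) / (N - 1)
    Sxy = sum((y - Y_bar) * (x - X_bar) for y, x in zip(Y, X)) / (N - 1)
    R, B = Y_bar / X_bar, Sxy / Sx2
    rho2 = Sxy ** 2 / (Sx2 * Sy2)
    theta = (1 - n / N) / n  # (1 - f) / n
    return {
        "mean": theta * Sy2,
        "ratio": theta * (Sy2 + R ** 2 * Sx2 - 2 * R * Sxy),
        "regression": theta * Sy2 * (1 - rho2),
        "gap": theta * Sx2 * (B - R) ** 2,  # ratio minus regression
    }


# Made-up population values (assumption).
X_pop = [12.0, 15.0, 9.0, 20.0, 18.0, 11.0, 14.0, 16.0]
Y_pop = [30.0, 38.0, 25.0, 47.0, 45.0, 28.0, 33.0, 41.0]
v = large_sample_variances(Y_pop, X_pop, n=3)
```

The entry `gap` is $[(1-f)/n]\,S_x^2 (B - R)^2$, which is the same quantity as $[(1-f)/n](\rho S_y - R S_x)^2$ in the comparison above.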

8.3.2 Difference Estimator 


Let $\hat Y$ and $\hat X$ be unbiased estimates of $Y$ and $X$ following any sampling design.
If $\hat Y$ and $\hat X$ are correlated, one may think of improving $\hat Y$ by adding to it a zero
function $X - \hat X$, that is, a function with $E(X - \hat X) = 0$. Thus, we may define the difference
estimator

$$\hat Y_D = \hat Y + \lambda\,(X - \hat X),$$

where $\lambda$ is a constant. $\hat Y_D$ is unbiased, whatever may be the value of $\lambda$. Further,
$V(\hat Y_D)$ is given by $V(\hat Y_D) = V(\hat Y) - 2\lambda\,\mathrm{cov}(\hat X, \hat Y) + \lambda^2 V(\hat X)$.

The optimum value of $\lambda$ is obtained by minimizing $V(\hat Y_D)$ with respect to
$\lambda$, and then $\lambda_{opt} = \mathrm{cov}(\hat X, \hat Y)/V(\hat X) = B$, say.

The optimum difference estimator is $\hat Y_{D(opt)} = \hat Y + B(X - \hat X)$, where $B$ is
the regression coefficient of $\hat Y$ on $\hat X$, and hence it may be called a regression
estimator. Further,

$$V_{opt}(\hat Y_D) = V(\hat Y)\,[1 - \rho^2(\hat X, \hat Y)].$$

Thus, we may write

$$V(\hat Y_D) = V(\hat Y)\,[1 - \rho^2(\hat X, \hat Y)]\,(1 + \delta^2),$$

where

$$\delta^2 = \frac{\rho^2(\hat X, \hat Y)}{1 - \rho^2(\hat X, \hat Y)}\cdot\frac{(B - \lambda)^2}{B^2}.$$

Thus, the minimum variance of the difference estimator is attained when $\lambda = B$.
As $B$ is not known in practice, it is necessary to use a good guess value of $B$
from past experience so as to obtain a near optimum difference estimate.
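The effect of a guessed $\lambda$ on the variance can be illustrated with a few lines of code. The function name `difference_variance` and the variance and covariance figures are hypothetical.

```python
def difference_variance(lam, v_y, v_x, cov_xy):
    """V(Y_hat + lam * (X - X_hat)) for a fixed constant lam."""
    return v_y - 2 * lam * cov_xy + lam ** 2 * v_x


# Made-up design variances and covariance of Y_hat and X_hat (assumption).
v_y, v_x, cov_xy = 25.0, 16.0, 12.0
B = cov_xy / v_x  # optimum lam
rho2 = cov_xy ** 2 / (v_x * v_y)
v_opt = difference_variance(B, v_y, v_x, cov_xy)

lam_guess = 1.0  # a guessed value used in place of B
delta2 = rho2 / (1 - rho2) * (B - lam_guess) ** 2 / B ** 2
v_guess = difference_variance(lam_guess, v_y, v_x, cov_xy)
```

With these numbers the optimum variance is $V(\hat Y)(1 - \rho^2) = 16$, and the guessed $\lambda = 1$ inflates it by the factor $1 + \delta^2 = 1.0625$.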
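In simple random sampling without replacement,

$$V_{opt}(\bar y_D) = \frac{(1-f)}{n}\,S_y^2\,(1-\rho^2).$$

If $B$ is estimated from a SRSWOR of size $n$ by

$$b = \sum_{i=1}^{n}(y_i - \bar y)(x_i - \bar x)\Big/\sum_{i=1}^{n}(x_i - \bar x)^2,$$

then the resulting difference estimate is

$$\bar y_{reg} = \bar y + b(\bar X - \bar x),$$

with large sample variance equal to

$$V(\bar y_{reg}) = \frac{(1-f)}{n}\,S_y^2\,(1-\rho^2) = V_{opt}(\bar y_D).$$

Further,

$$\text{Est. } V_{opt}(\bar y_D) = \frac{(1-f)}{n(n-2)}\sum_{i=1}^{n}[(y_i - \bar y) - b(x_i - \bar x)]^2.$$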


8.3.3 Estimators of Variance of Linear Regression Estimator 

Under simple random sampling, the leading term of the MSE or variance of $\bar y_{reg}$
is given by

$$V(\bar y_{reg}) = \frac{1-f}{n}\cdot\frac{1}{N-1}\sum_{i=1}^{N} d_i^2,$$

where $d_i = Y_i - \bar Y - B(X_i - \bar X)$ is the residual of $Y_i$ from the regression
line $\bar Y + B(X_i - \bar X)$, $B$ being the finite population regression coefficient of $y$ on $x$.
The commonly adopted approximate variance estimator is

$$v_{reg} = \frac{(1-f)}{n}\cdot\frac{1}{(n-2)}\sum_{i=1}^{n}\hat d_i^2,$$

where $\hat d_i = (y_i - \bar y) - b(x_i - \bar x)$ is the sample residual corresponding to the $i$th observation
in the sample. $v_{reg}$ is biased, although the bias is negligible in large
samples.

A ratio-type adjustment as proposed by Deng and Wu (1987) gives

$$v_g = (\bar X/\bar x)^g\, v_{reg}.$$

Minimizing the leading term in the expansion of MSE$(v_g)$ with respect to
$g$ gives $g_{opt} = \{S_{zx}\bar X\}/\{S_x^2 \bar Z\}$, which is the population regression coefficient
of $Z_i/\bar Z$ on $X_i/\bar X$, where $Z_i = d_i^2$ is the squared residual. Thus, it is appropriate to
suggest an estimator in the proposed class as

$$v'_g = (\bar X/\bar x)^{\hat g}\, v_{reg},$$

where $\hat g$ is the sample estimator of $g_{opt}$.

Another variance estimator, derived from standard regression theory
(Royall, 1970), is

$$v_L = v_{reg}\left[1 + \frac{(\bar x - \bar X)^2}{(1-f)\,\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x)^2}\right].$$

The jackknife variance estimator is

$$v_J = \frac{(n-1)}{n}\sum_{j=1}^{n}(T_{(j)} - T_{(\cdot)})^2,$$

where $T_{(j)}$ is the regression estimate based on the sample of size $(n-1)$ obtained by deleting the $j$th
unit and $T_{(\cdot)}$ is the average of the $T_{(j)}$.
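The usual estimator $v_{reg}$ and the delete-one jackknife estimator $v_J$ can be sketched as follows. The function names and the sample values are assumptions made for illustration; the jackknife is coded exactly as the sum of squared deviations of the delete-one regression estimates, scaled by $(n-1)/n$.

```python
def reg_estimate(y, x, X_bar):
    """Linear regression estimate of the mean and the sample slope b."""
    n = len(y)
    y_bar, x_bar = sum(y) / n, sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((yi - y_bar) * (xi - x_bar) for yi, xi in zip(y, x)) / sxx
    return y_bar + b * (X_bar - x_bar), b


def v_reg(y, x, N):
    """(1 - f) / n times the residual sum of squares divided by (n - 2)."""
    n = len(y)
    y_bar, x_bar = sum(y) / n, sum(x) / n
    _, b = reg_estimate(y, x, x_bar)
    rss = sum(((yi - y_bar) - b * (xi - x_bar)) ** 2 for yi, xi in zip(y, x))
    return (1 - n / N) / n * rss / (n - 2)


def v_jackknife(y, x, X_bar):
    """(n - 1) / n times the sum of squares of the delete-one estimates."""
    n = len(y)
    t = [reg_estimate(y[:j] + y[j + 1:], x[:j] + x[j + 1:], X_bar)[0]
         for j in range(n)]
    t_bar = sum(t) / n
    return (n - 1) / n * sum((tj - t_bar) ** 2 for tj in t)


# Made-up sample; N = 50 and X_bar = 5.0 are assumed known.
xs = [1.0, 2.0, 4.0, 7.0, 9.0]
ys = [5.5, 7.6, 14.9, 22.3, 30.1]
usual, jack = v_reg(ys, xs, N=50), v_jackknife(ys, xs, X_bar=5.0)
```

For data lying exactly on a straight line both estimators return zero, since every residual vanishes and every delete-one estimate coincides.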

Deng and Wu (1987) have made an empirical study on the behaviour of the
variance estimators $v_0$, $v_1$, $v_2$, $v_g$, $v_L$ and $v_J$ as regards MSE and bias. The
empirical findings show that $v_g$ has small and often the smallest MSE among all
the estimators under comparison, whereas $v_J$ has the largest MSE.

As regards bias, all estimators except $v_J$ are downward biased. The estimator
$v_J$ is always upward biased and $v_L$ has a smaller bias than $v_0$, $v_1$, $v_2$ and $v_g$.
Royall and Cumberland (1978a) have proposed some bias-robust
variance estimators, including the jackknife variance estimator, for linear regression
models. Royall and Cumberland (1981b) have examined the usual variance
estimator for the linear regression estimator of the finite population total under
prediction models and have shown that the biases in $v_{reg}$ and $v_L$ can be serious when the constant
variance condition in the model fails. Failure of the
constant-variance model can bring about a serious bias in the variance estimators,
whereas failure of the linear regression can introduce a bias in the estimator itself. Royall
and Cumberland (1981b) have also made an empirical study of the
results concerning variance estimators with the help of some natural
populations.

8.3.4 Unbiased Regression Estimators

We have already seen that the linear regression estimator

$$\bar y_{reg} = \bar y + b(\bar X - \bar x)$$

is a biased estimator and the exact bias is given by

$$\text{Bias}(\bar y_{reg}) = -\mathrm{cov}(b, \bar x).$$

This bias is of $O(1/n)$ and can be reduced to a lower order by adopting the bias
reduction techniques adopted in the case of the ratio estimator. Williams (1961)
has suggested a novel method to generate unbiased regression estimators. The
sample regression coefficient can also be made unbiased by making use of the
Midzuno type of sampling scheme suggested by Nanjamma, Murthy and Sethi
(1959), in which the first two units, say the $i$th and $j$th units, are selected with
probability of joint selection proportional to $(X_i - X_j)^2$, where $X_i$ and $X_j$ are the
values of the auxiliary variable associated with the $i$th and $j$th units of the
population, and the remaining $(n - 2)$ units in the sample are selected out of the
remaining $(N - 2)$ units in the population by simple random sampling without
replacement. Singh and Srivastava (1980) followed such a probability scheme,
called by them sampling scheme I, which makes the linear regression estimate
$\bar y_{reg}$ completely unbiased. Thus, under such a sampling scheme,

$$P(s) = \frac{s_x^2}{\binom{N}{n}\,S_x^2}$$

and

$$E(\bar y_{reg}) = \frac{1}{\binom{N}{n}}\sum_{s}[\bar y + b(\bar X - \bar x)]\,\frac{s_x^2}{S_x^2} = \frac{1}{S_x^2}\,E'[(\bar Y + \bar y')\,s_x^2 - \bar x'\,s_{xy}]$$

(where $E'$ stands for expectation corresponding to the simple random sampling
scheme, $X'_i = X_i - \bar X$, $Y'_i = Y_i - \bar Y$, and $\bar x'$, $\bar y'$ are the corresponding sample means)

$$= \bar Y + \frac{1}{N(N-1)\,S_x^2}\sum_{i \ne j}^{N}(Y'_i X'^2_j - Y'_j X'_i X'_j) = \bar Y,$$

as the second term can be proved to be equal to zero.

Alternatively, $E(\bar y_{reg}) = \bar Y + (1/S_x^2)\,[\mathrm{cov}'(\bar y, s_x^2) - \mathrm{cov}'(\bar x, s_{xy})]$, where the
covariance is based on the SRSWOR scheme.

As $\mathrm{cov}'(\bar y, s_x^2) = \mathrm{cov}'(\bar x, s_{xy}) = (1/n - 1/N)\,K_{21}$, where $K_{21}$ is the $(2, 1)$th
cumulant of $x$ and $y$, the second term vanishes and thus $E(\bar y_{reg}) = \bar Y$. Singh and
Srivastava (1980) have derived an approximate expression for the variance, which
to $O(1/n)$ is $[(1-f)/n]\,S_y^2\,(1-\rho^2)$, equivalent to the approximate mean square error
of the biased linear regression estimator. An unbiased estimator of $V(\bar y_{reg})$ is

$$\text{Est. } V(\bar y_{reg}) = \bar y_{reg}^2 - \frac{1}{Nn}\,\frac{S_x^2}{s_x^2}\left[\sum_{i=1}^{n} y_i^2 + \frac{N-1}{n-1}\sum_{i \ne j}^{n} y_i y_j\right].$$
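The exact unbiasedness of $\bar y_{reg}$ under a scheme with $P(s)$ proportional to $s_x^2$ can be verified by complete enumeration of a small population. The sketch below is illustrative; the function name and the population values are assumptions, and the weights are taken proportional to $\sum_{i \in s}(x_i - \bar x)^2$, which is proportional to $s_x^2$.

```python
from itertools import combinations


def expected_regression_estimate(Y, X, n):
    """Exact E(y_reg) when P(s) is proportional to s_x^2 (sampling scheme I)."""
    N = len(Y)
    X_bar = sum(X) / N
    weighted_sum = 0.0
    weight_total = 0.0
    for s in combinations(range(N), n):
        ys = [Y[i] for i in s]
        xs = [X[i] for i in s]
        y_bar, x_bar = sum(ys) / n, sum(xs) / n
        sxx = sum((x - x_bar) ** 2 for x in xs)
        if sxx == 0:
            continue  # such a sample has probability zero under the scheme
        sxy = sum((y - y_bar) * (x - x_bar) for y, x in zip(ys, xs))
        y_reg = y_bar + (sxy / sxx) * (X_bar - x_bar)
        weighted_sum += sxx * y_reg  # weight proportional to s_x^2
        weight_total += sxx
    return weighted_sum / weight_total


# Made-up population with a clearly non-linear relationship (assumption).
Y_pop = [3.0, 1.0, 8.0, 20.0, 15.0, 40.0]
X_pop = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
expectation = expected_regression_estimate(Y_pop, X_pop, n=3)
```

Up to rounding error the expectation equals the population mean $\bar Y = 14.5$, even though $y$ is not linearly related to $x$ in this population.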


An alternative sampling scheme II suggested by Singh and Srivastava
(1980) is as follows:

Select the first unit with probability for the $i$th unit proportional to
$(X_i - \bar X)^2$ and select $(n - 1)$ units from the remaining $(N - 1)$ units by simple
random sampling without replacement. The probability of selecting a
sample $s$ following this scheme is

$$P(s) = \sum_{i=1}^{n}(x_i - \bar X)^2\Big/\binom{N-1}{n-1}\sum_{i=1}^{N}(X_i - \bar X)^2.$$

Singh and Srivastava have shown that

$$t^{*}_{s} = \frac{n(N-1)}{N(n-1)}\cdot\frac{\bar y\sum_{i=1}^{n}(x_i - \bar x)^2 + (\bar X - \bar x)\sum_{i=1}^{n} y_i (x_i - \bar x)}{\sum_{i=1}^{n}(x_i - \bar X)^2}$$

is unbiased under sampling scheme II, provided the number of
population units with $X_i = \bar X$ is less than $n$. However, as shown by Singh
and Srivastava for large samples, $t^{*}_{s}$ under sampling scheme II is less
efficient than $\bar y_{reg}$ under sampling scheme I.


8.3.5 Asymptotic Normality of the Regression Estimator 

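Write $\bar y_{reg} = \bar y + b(\bar X - \bar x)$. Define the residual about the finite population least
squares regression line

$$d_i = Y_i - \bar Y - B(X_i - \bar X).$$

We have $\sum_{i=1}^{N} d_i = 0$ and $\sum_{i=1}^{N} d_i (X_i - \bar X) = 0$.

Further, $S_d^2 = \sum_{i=1}^{N} d_i^2/(N-1) = S_y^2\,(1-\rho^2)$.

Now, embed the given finite population in a sequence of populations
$\{U_\nu\}$, indexed by $\nu$, where $n_\nu$ and $N_\nu$ both increase without bound as $\nu \to \infty$.

Theorem (Scott and Wu, 1981): Under simple random sampling without
replacement,

$$\frac{\sqrt{n_\nu}\,(\bar y_{\nu(reg)} - \bar Y_\nu)}{\sqrt{1 - f_\nu}\;S_{\nu d}} \xrightarrow{L} N(0, 1) \text{ as } \nu \to \infty,$$

provided

i) $\{d_{\nu i}\}$ satisfies the Lindeberg-Hajek condition

$$\lim_{\nu \to \infty}\sum_{T_\nu(\delta)}\frac{(d_{\nu i} - \bar D_\nu)^2}{(N_\nu - 1)\,S_{\nu d}^2} = 0 \text{ for any } \delta > 0,$$

where $\bar D_\nu$ is the population mean of the $d_{\nu i}$ and $T_\nu(\delta)$ is the set of units in $U_\nu$ for which
$|d_{\nu i} - \bar D_\nu|/\{\sqrt{1 - f_\nu}\;S_{\nu d}\} > \delta\sqrt{n_\nu}$;

ii) defining $Z_{\nu i} = (X_{\nu i} - \bar X_\nu)^2/S_{\nu x}^2$ and $W_{\nu i} = d_{\nu i}^2/S_{\nu d}^2$,
$V(\bar z_\nu)$ and $V(\bar w_\nu) \to 0$ as $\nu \to \infty$.

Proof: Write $\bar y_{\nu(reg)} - \bar Y_\nu = \bar d_\nu - (b_\nu - B_\nu)(\bar x_\nu - \bar X_\nu)$, where $\bar d_\nu$ is the sample
mean of the $d_{\nu i}$'s. Now,

$$\frac{\sqrt{n_\nu}\,(\bar y_{\nu(reg)} - \bar Y_\nu)}{\sqrt{1 - f_\nu}\;S_{\nu d}} = \frac{\sqrt{n_\nu}\,\bar d_\nu}{\sqrt{1 - f_\nu}\;S_{\nu d}} - B_1 B_2 B_3,$$

where

$$B_1 = \frac{\sqrt{n_\nu}\,(\bar x_\nu - \bar X_\nu)}{\sqrt{1 - f_\nu}\;S_{\nu x}}, \quad B_2 = \frac{S_{\nu x}^2}{s_{\nu x}^2} \quad\text{and}\quad B_3 = \frac{\sum_{i=1}^{n_\nu} d_{\nu i}\,(x_{\nu i} - \bar x_\nu)}{(n_\nu - 1)\,S_{\nu d}\,S_{\nu x}}.$$

The first term $\xrightarrow{L} N(0, 1)$ provided the Lindeberg-Hajek condition enunciated
earlier is satisfied. $B_1$ is stochastically bounded since $E(B_1) = 0$ and $V(B_1) = 1$, and
$B_2 \to 1$ as $\nu \to \infty$. $B_3$ converges in probability to zero as $\nu \to \infty$. Hence, the
theorem is proved.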


8.3.6 Regression Estimators in Stratified Random Sampling 

As in the case of ratio estimators in stratified random sampling, we may have two
types of estimators: (a) the separate regression estimator and (b) the combined regression
estimator.

The separate regression estimator

Assume that there are $k$ strata and the weights are $W_1, \ldots, W_k$. Then

$$\bar y_{s(reg)} = \sum_{t=1}^{k} W_t\,\bar y_{t(reg)},$$

where $\bar y_{t(reg)} = \bar y_t + b_t(\bar X_t - \bar x_t)$ and $b_t = s_{txy}/s_{tx}^2$.

The combined regression estimator

$$\bar y_{c(reg)} = \bar y_{st} + b^{*}(\bar X - \bar x_{st}),$$

where

$$b^{*} = \frac{\sum_{t=1}^{k} W_t^2\left(\frac{1}{n_t} - \frac{1}{N_t}\right)s_{txy}}{\sum_{t=1}^{k} W_t^2\left(\frac{1}{n_t} - \frac{1}{N_t}\right)s_{tx}^2}.$$

To the first order of approximation,

$$\text{MSE}(\bar y_{s(reg)}) = \sum_{t=1}^{k} W_t^2\left(\frac{1}{n_t} - \frac{1}{N_t}\right)S_{ty}^2\,(1 - \rho_t^2)$$

and

$$\text{MSE}(\bar y_{c(reg)}) = \sum_{t=1}^{k} W_t^2\left(\frac{1}{n_t} - \frac{1}{N_t}\right)S_{ty}^2\,(1 - \rho^{*2}),$$

where $\rho_t$ and $\rho^{*}$ are defined in relation to $b_t$ and $b^{*}$ respectively.

Thus, subject to the approximations involved, the separate regression
estimator is more efficient than the combined regression estimator unless the
regression coefficient $\beta_t$ is the same from stratum to stratum.
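The two stratified estimators can be computed side by side. In the sketch below the function name `stratified_regression`, the dictionary layout for a stratum and the data are assumptions; the combined slope is the weighted ratio of stratum covariances to stratum variances with weights $W_t^2(1/n_t - 1/N_t)$.

```python
def stratified_regression(strata):
    """Return (separate, combined) regression estimates of the population mean.

    Each stratum is a dict with keys W (weight), N (stratum size),
    X_bar (known stratum mean of x), and the sample lists y and x.
    """
    separate = y_st = x_st = X_bar = num = den = 0.0
    for s in strata:
        n = len(s["y"])
        y_bar, x_bar = sum(s["y"]) / n, sum(s["x"]) / n
        sxx = sum((x - x_bar) ** 2 for x in s["x"]) / (n - 1)
        sxy = sum((y - y_bar) * (x - x_bar)
                  for y, x in zip(s["y"], s["x"])) / (n - 1)
        # Separate estimator: a regression adjustment within each stratum.
        separate += s["W"] * (y_bar + (sxy / sxx) * (s["X_bar"] - x_bar))
        y_st += s["W"] * y_bar
        x_st += s["W"] * x_bar
        X_bar += s["W"] * s["X_bar"]
        a = s["W"] ** 2 * (1 / n - 1 / s["N"])
        num += a * sxy
        den += a * sxx
    b_star = num / den  # pooled slope for the combined estimator
    return separate, y_st + b_star * (X_bar - x_st)


# Made-up strata in which y = 1 + 2x holds exactly in both strata.
strata = [
    {"W": 0.4, "N": 40, "X_bar": 5.0,
     "x": [3.0, 5.0, 8.0], "y": [7.0, 11.0, 17.0]},
    {"W": 0.6, "N": 60, "X_bar": 10.0,
     "x": [8.0, 9.0, 13.0, 12.0], "y": [17.0, 19.0, 27.0, 25.0]},
]
separate, combined = stratified_regression(strata)
```

Because the toy strata share one regression line, both estimators return the population mean $1 + 2\bar X = 17$; they differ once the stratum slopes differ.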

8.3.7 Alternative Regression-type Estimators 


For any general sampling design let us consider a class of regression-type
estimators of the population total $Y$ such as $\hat Y_k = k_1\hat Y + k_2\,(X - \hat X)$, where $k_1$ and $k_2$
are constants. Some special cases are

i) $\hat Y'_k = \hat Y + k\,(X - \hat X)$,

which is the well known difference estimator, and the optimum value of $k$ is $B$, the
finite population regression coefficient of $y$ on $x$.

ii) $\hat Y''_k = k\hat Y + (X - \hat X)$,

where $k$ is a constant and the optimum value of $k$ (neglecting the bias) is
$k_{opt} = \mathrm{cov}(\hat Y, \hat X)/V(\hat Y)$, with min $V(\hat Y''_k) = V(\hat X)\,(1 - \rho^2)$.

Thus, $\hat Y''_k$ is superior to the usual regression estimator if $V(\hat X) < V(\hat Y)$.

For the general regression-type estimator $\hat Y_k$, T.J. Rao (1993a) has shown
that the optimum values of $k_1$ and $k_2$ minimizing the MSE are

$$k_2 = k_1 B, \text{ where } B = \mathrm{cov}(\hat Y, \hat X)/V(\hat X),$$

$$k_1 = [1 + C_y^2\,(1 - \rho^2)]^{-1}, \text{ where } C_y^2 = V(\hat Y)/Y^2.$$

The minimum MSE of $\hat Y_k$ is found to be equal to

$$\text{min. MSE}(\hat Y_k) = V(\hat Y)\,(1 - \rho^2)\,[1 + C_y^2\,(1 - \rho^2)]^{-1},$$

showing thereby that $\hat Y_k$ is better than the conventional linear regression
estimator, which has large sample variance $V(\hat Y)(1 - \rho^2)$.
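The optimum constants can be checked against the mean square error written out in full, MSE $= k_1^2 V(\hat Y) + k_2^2 V(\hat X) - 2k_1 k_2\,\mathrm{cov}(\hat Y, \hat X) + (k_1 - 1)^2 Y^2$. The sketch below uses hypothetical function names and made-up inputs.

```python
def mse_regression_type(k1, k2, Y, v_y, v_x, cov_xy):
    """MSE of k1 * Y_hat + k2 * (X - X_hat) about the total Y."""
    return (k1 ** 2 * v_y + k2 ** 2 * v_x - 2 * k1 * k2 * cov_xy
            + (k1 - 1) ** 2 * Y ** 2)


def rao_optimum(Y, v_y, v_x, cov_xy):
    """Return (k1, k2, minimum MSE) for the general regression-type class."""
    rho2 = cov_xy ** 2 / (v_x * v_y)
    B = cov_xy / v_x
    cy2 = v_y / Y ** 2
    k1 = 1 / (1 + cy2 * (1 - rho2))
    return k1, k1 * B, v_y * (1 - rho2) * k1


# Made-up total, variances and covariance (assumptions).
Y, v_y, v_x, cov_xy = 100.0, 400.0, 100.0, 150.0
k1, k2, min_mse = rao_optimum(Y, v_y, v_x, cov_xy)
regression_variance = v_y * (1 - cov_xy ** 2 / (v_x * v_y))
```

The gain over the conventional regression estimator is the factor $[1 + C_y^2(1-\rho^2)]^{-1}$, which is close to one unless the relative variance $C_y^2$ is sizeable; note also that $k_1$ depends on the unknown $Y$ through $C_y^2$.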

Kaur (1985) suggested an estimator for any sampling design

$$\hat Y'_{k(1)} = k_1\hat Y + \hat B\,(X - \hat X),$$

where $k_1$ is any constant and $\hat B$ is the sample regression coefficient. Minimizing
the large sample variance of $\hat Y'_{k(1)}$, the optimum value of $k_1$ is found to be $\rho^2$.
Thus, min. $V(\hat Y'_{k(1)}) = \rho^2\,V(\hat Y_{Reg})$.

A much older contribution relating to the innovations so far discussed is due
to I.S. Rao (1978), who considered an estimator of $\bar Y$ in the case of SRSWOR as

$$\hat{\bar Y}_{IS} = K_1\bar y + K_2\bar x.$$

The unbiasedness condition $(K_1 - 1)\bar Y + K_2\bar X = 0$ reduces it to

$$\hat{\bar Y}_{IS} = K_1\bar y + (1 - K_1)\,R\,\bar x.$$

The minimization of $V(\hat{\bar Y}_{IS})$ gives

$$K_{1\,opt} = \frac{R^2 S_x^2 - R S_{xy}}{S_y^2 + R^2 S_x^2 - 2R S_{xy}}$$

and

$$V(\hat{\bar Y}_{IS}) = \frac{(1-f)}{n}\,\bar Y^2\,\frac{C_y^2\,C_x^2\,(1 - \rho^2)}{C_y^2 + C_x^2 - 2\rho\,C_y C_x}.$$

Thus, $\hat{\bar Y}_{IS}$ with optimum $K_1$ is more efficient than the conventional regression
estimator if $\rho < \frac{1}{2}\,(C_y/C_x)$.


8.4 Estimators with Multi-Auxiliary Variables 

When data on several auxiliary variables are available, the question arises about
the manner in which this information should be used at the estimation stage so
as to provide a more efficient estimator of $\bar Y$ than the simple expansion
estimator and estimators using a single auxiliary variable. Some of these
procedures are given below with the help of two auxiliary variables $x_1$ and $x_2$,
assuming simple random sampling.

8.4.1 Difference Estimator with Two Auxiliary Variables 

As a generalization of the difference estimator using a single auxiliary variable, a
difference estimator of $\bar Y$ under SRSWOR using information on $x_1$ and $x_2$ may be
defined as

$$\hat{\bar Y}_{md} = \bar y + \lambda_1(\bar X_1 - \bar x_1) + \lambda_2(\bar X_2 - \bar x_2),$$

where $\lambda_1$ and $\lambda_2$ are pre-assigned constants and the population means $\bar X_1$ and $\bar X_2$
of the auxiliary variables are known. $\hat{\bar Y}_{md}$ is an unbiased estimator of $\bar Y$ and the
optimum values of $\lambda_1$ and $\lambda_2$, which minimize $V(\hat{\bar Y}_{md})$, are $B_1$ and $B_2$, which are the
finite population partial regression coefficients of $y$ on $x_1$ eliminating the effect
of $x_2$, and of $y$ on $x_2$ eliminating the effect of $x_1$. Thus, for the optimal choice $B_1$ and
$B_2$, the optimum variance of $\hat{\bar Y}_{md}$ is

$$V_{opt}(\hat{\bar Y}_{md}) = \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2\,(1 - R^2),$$

where $R$ is the multiple correlation coefficient between $y$ and $(x_1, x_2)$. For the
practical use of $\hat{\bar Y}_{md}$ with optimum $\lambda$'s, the survey sampler may use guessed
values of $B_1$ and $B_2$ from past surveys or alternatively estimate these values
from the sample. Thus, we have the regression estimator based on two auxiliary
variables defined as

$$\hat{\bar Y}_{mreg} = \bar y + \hat B_1(\bar X_1 - \bar x_1) + \hat B_2(\bar X_2 - \bar x_2).$$

$\hat{\bar Y}_{mreg}$ is a biased estimator of $\bar Y$, with bias likely to be negligible for large samples.
To the first order of approximation,

$$V(\hat{\bar Y}_{mreg}) = \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2\,(1 - R^2).$$

It may be verified that the multivariate regression estimator with two
auxiliary variables is always more efficient than the linear regression estimator
with a single auxiliary variable.

Raj (1965) suggested a weighted difference estimator

$$\hat{\bar Y}_{wd} = W\bar y_{d1} + (1 - W)\,\bar y_{d2},$$

where $\bar y_{d1} = \bar y + \lambda_1(\bar X_1 - \bar x_1)$ and $\bar y_{d2} = \bar y + \lambda_2(\bar X_2 - \bar x_2)$, $\lambda_1$ and $\lambda_2$ being known
constants. The optimal weight is given by

$$W_{opt} = \frac{V(\bar y_{d2}) - \mathrm{cov}(\bar y_{d1}, \bar y_{d2})}{V(\bar y_{d1}) + V(\bar y_{d2}) - 2\,\mathrm{cov}(\bar y_{d1}, \bar y_{d2})}.$$

Further,

$$V_{opt}(\hat{\bar Y}_{wd}) = \frac{V(\bar y_{d1})\,V(\bar y_{d2}) - [\mathrm{cov}(\bar y_{d1}, \bar y_{d2})]^2}{V(\bar y_{d1}) + V(\bar y_{d2}) - 2\,\mathrm{cov}(\bar y_{d1}, \bar y_{d2})} = V(\bar y_{d1}) - \frac{[V(\bar y_{d1}) - \mathrm{cov}(\bar y_{d1}, \bar y_{d2})]^2}{V(\bar y_{d1}) + V(\bar y_{d2}) - 2\,\mathrm{cov}(\bar y_{d1}, \bar y_{d2})},$$

showing thereby that the weighted difference estimator with two auxiliary
variables is always more efficient than the difference estimator based on a single
auxiliary variable.

The same result can be generalized to show that the weighted regression
estimator with two auxiliary variables is more precise than the one based on a single auxiliary
variable, and naturally more efficient than the simple expansion estimator $\bar y$.
Further,

$$V_{opt}(\hat{\bar Y}_{wd}) - V_{opt}(\hat{\bar Y}_{md}) \ge 0,$$

showing thereby that the multivariate difference estimator is always more
efficient than the weighted difference estimator suggested by Raj (1965).


8.4.2 Multivariate Regression and Regression Type Estimators 
B. Ghosh (1947) has suggested an estimator of the form

$$\hat{\bar Y}_G = \bar y + \sum_{i=1}^{k}\hat\beta_i\,(\bar X_i - \bar x_i),$$

where $\hat\beta_i$ is the estimate of the regression coefficient of $y$ on $x_i$ ($i = 1, \ldots, k$).

Suppose the whole population is divided into $k$ classes, the $i$th class having $N_i$
units. Let $Y_{ij}$ be the value of the $j$th unit in the $i$th class ($i = 1, \ldots, k$ and $j = 1, \ldots, N_i$).
The units in the $i$th class have the same value of the auxiliary variables $(X_{i1}, X_{i2}, \ldots,
X_{ip})$, where $p$ is the number of auxiliary variables. It is assumed that in repeated
samples with replacement the number of units having value $(X_{i1}, \ldots, X_{ip})$, say
$n_i$ ($i = 1, \ldots, k$), is fixed. Let $\sum_i n_i = n$. The assumed model is

$$Y_{ij} = \bar Y + B'(X_i - \bar X) + e_{ij}, \quad i = 1, \ldots, k; \; j = 1, \ldots, N_i,$$

where $B$ is a column vector of
constants $(B_1, B_2, \ldots, B_p)'$, $X_i - \bar X$ is the column vector $(X_{i1} - \bar X_1, \ldots, X_{ip} - \bar X_p)'$ and
$\bar X$ is the column vector $(\bar X_1, \ldots, \bar X_p)'$, with

$$\bar Y = N^{-1}\sum_{i=1}^{k}\sum_{j=1}^{N_i} Y_{ij} \quad\text{and}\quad \bar X_a = N^{-1}\sum_{i=1}^{k}\sum_{j=1}^{N_i} X_{ia} \quad (a = 1, \ldots, p).$$

Further, $E(e_{ij} \mid i) = 0$ and $E(e_{ij}^2 \mid i) = \lambda$ (constant).

Define an estimator of the population mean $\bar Y$ as $\hat{\bar Y} = \bar y + B'(\bar X - \bar x)$, where $\bar x$ is the
sample mean vector. Minimizing $V(\hat{\bar Y})$ to find the optimum value of $B$ and
substituting the estimate $\hat B$ of $B$ from the sample, Shukla (1965) has
suggested a multivariate linear regression estimator as

$$\hat{\bar Y}_{mlr} = \bar y + \hat B'(\bar X - \bar x),$$

which has the same form as $\hat{\bar Y}_{mreg}$ stated earlier. Under the model, $E(\hat{\bar Y}_{mlr}) = \bar Y$.
Further,

$$V(\hat{\bar Y}_{mlr}) = \frac{S_y^2\,(1 - R^2)}{n}\,[1 + n(\bar X - \bar x)'\,s^{-1}\,(\bar X - \bar x)],$$

where $R^2$ is the square of the multiple correlation coefficient between $y$
and $(x_1, x_2, \ldots, x_p)$ and $s$ is the matrix of corrected sums of squares and sums of
products from the sample.

To find the average value of $V(\hat{\bar Y}_{mlr})$ over all possible samples, let us assume
that the $x$'s have a $p$-variate normal distribution. Thus, $n(n-1)(\bar X - \bar x)'\,s^{-1}\,(\bar X - \bar x)$ is
distributed as Hotelling's $T^2$ distribution. Hence

$$E[V(\hat{\bar Y}_{mlr})] = \frac{S_y^2\,(1 - R^2)}{n}\left[1 + \frac{p}{n - p - 2}\right].$$

For $p = 1$,

$$E[V(\hat{\bar Y}_{mlr})] = \frac{S_y^2\,(1 - \rho^2)}{n}\left[1 + \frac{1}{n - 3}\right].$$

Srivastava (1965) has proposed a weighted linear regression estimator of
the population mean $\bar Y$ to make use of $p$ auxiliary variables $x_1, x_2, \ldots, x_p$ as

$$\hat{\bar Y}_{wreg} = \sum_{i=1}^{p} W_i\,t_i,$$

where $t_i = \bar y + b_i(\bar X_i - \bar x_i)$ and $b_i$ is the sample regression coefficient of $y$ on $x_i$.

8.4.3 Multivariate Ratio-type and Product-type Estimators 

When two auxiliary characteristics $x_1$ and $x_2$ are available and positively
correlated with the study variable $y$, Olkin (1958) proposed a weighted estimator

$$\hat{\bar Y}_{wR} = W\hat{\bar Y}_{R1} + (1 - W)\,\hat{\bar Y}_{R2}, \text{ where } \hat{\bar Y}_{Ri} = (\bar y/\bar x_i)\,\bar X_i, \; i = 1, 2.$$

Assuming the bias to be negligible, the minimization of $V(\hat{\bar Y}_{wR})$ gives the
optimum $W$ as

$$W_{opt} = \frac{V_{22} - V_{12}}{V_{11} + V_{22} - 2V_{12}},$$

where $V_{11} = V(\hat{\bar Y}_{R1})$, $V_{22} = V(\hat{\bar Y}_{R2})$ and $V_{12} = \mathrm{cov}(\hat{\bar Y}_{R1}, \hat{\bar Y}_{R2})$.

Further,

$$V_{opt}(\hat{\bar Y}_{wR}) = \frac{V_{11}V_{22} - V_{12}^2}{V_{11} + V_{22} - 2V_{12}}.$$

It may be seen that $V(\hat{\bar Y}_{R1}) - V_{opt}(\hat{\bar Y}_{wR}) \ge 0$, showing thereby that Olkin's
ratio-type estimator with two auxiliary variables is more efficient than the
simple ratio estimator based on a single auxiliary variable.
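The optimum weight and the resulting variance depend only on $V_{11}$, $V_{22}$ and $V_{12}$, so they are simple to evaluate. The function names and the numerical inputs in this sketch are hypothetical.

```python
def optimal_weight(v11, v22, v12):
    """Return (W_opt, V_opt) for W * t1 + (1 - W) * t2."""
    d = v11 + v22 - 2 * v12  # variance of t1 - t2, assumed positive
    return (v22 - v12) / d, (v11 * v22 - v12 ** 2) / d


def combined_variance(w, v11, v22, v12):
    """Variance of W * t1 + (1 - W) * t2 for an arbitrary weight W."""
    return w ** 2 * v11 + (1 - w) ** 2 * v22 + 2 * w * (1 - w) * v12


# Made-up variances and covariance of the two ratio estimators (assumption).
v11, v22, v12 = 4.0, 9.0, 3.0
w_opt, v_opt = optimal_weight(v11, v22, v12)
```

Here $W_{opt} = 6/7$ and $V_{opt} = 27/7$, which is below both $V_{11} = 4$ and $V_{22} = 9$. In practice the $V$'s are unknown and must be estimated from the sample.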

For SRSWOR, assuming $C_{x_1} = C_{x_2} = C_x$, that is, the coefficients of variation
of $x_1$ and $x_2$ to be the same, a sufficient condition for $\hat{\bar Y}_{wR}$ to be more efficient
than $\bar y$ is easily seen to be

$$\frac{\rho_{01} + \rho_{02}}{1 + \rho_{12}} > \frac{1}{2}\left(\frac{C_x}{C_y}\right),$$

where $\rho_{0i}$ is the correlation coefficient between $y$ and $x_i$ and $\rho_{12}$ is the
correlation coefficient between $x_1$ and $x_2$.

Further, Olkin's weighted ratio-type estimator with $(p + q)$ auxiliary
variables is always more efficient than the one based on $p$ auxiliary variables.

Olkin's method is accompanied by the major difficulty that the weights
are functions of unknown population parameters and therefore
have to be estimated from the sample for any practical use of this estimator.
Shukla (1966) has proposed an alternative estimator of $\bar Y$ as

$$\hat{\bar Y}_{S} = \bar y\,\left(\sum_{i=1}^{p} W_i\bar X_i\right)\Big/\left(\sum_{i=1}^{p} W_i\bar x_i\right),$$

where $\sum_{i=1}^{p} W_i = 1$, and has shown that to $O(1/n)$ it has the same variance as that of
Olkin's multivariate ratio-type estimator.

Singh (1967b) has suggested a weighted multivariate product estimator of
$\bar Y$ as

$$\hat{\bar Y}_{wp} = \sum_{i=1}^{p} W_i\,\bar y\,\frac{\bar x_i}{\bar X_i}.$$

John (1969) has put forward an alternative estimator $\hat{\bar Y}'_{wp} = \bar y\,\sum_i W_i\bar x_i\big/\sum_i W_i\bar X_i$.
Srivastava (1965) and Rao and Mudholkar (1967) have built up a
weighted multivariate estimator, where some of the auxiliary variables are used
for the ratio method of estimation and others for the product method of estimation.

8.4.4 Ratio of Regression Estimators 

In the presence of two auxiliary variables $x_1$ and $x_2$, when the study variable y is positively correlated with $x_1$, it may be worthwhile to make a regression adjustment to both $\bar{y}$ and $\bar{x}_1$ with the help of $x_2$ to estimate the population ratio R:

$$\hat{R}_1 = \frac{\bar{y} + b_{02}(\bar{X}_2 - \bar{x}_2)}{\bar{x}_1 + b_{12}(\bar{X}_2 - \bar{x}_2)},$$

where $b_{02}$ and $b_{12}$ are the regression coefficients of y on $x_2$ and of $x_1$ on $x_2$ respectively. The bias of $\hat{R}_1$ is of O(1/n). To the first order of approximation, and denoting by $C_0$ and $C_1$ the coefficients of variation of y and $x_1$ respectively,

$$\mathrm{MSE}(\hat{R}_1) = V(\hat{R}_1) = \frac{1 - f}{n}\,R^2\left[C_0^2(1 - \rho_{02}^2) + C_1^2(1 - \rho_{12}^2) - 2C_0C_1(\rho_{01} - \rho_{12}\rho_{02})\right]$$

$$= V(\bar{y}/\bar{x}_1) - \frac{1 - f}{n}\,R^2\left[C_0\rho_{02} - C_1\rho_{12}\right]^2,$$

showing thereby that the regression adjustment pays in terms of increased efficiency.
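The reduction in mean square error is a matter of simple algebra and can be verified numerically. The sketch below compares only the bracketed terms (the common factor (1 - f)R^2/n is omitted); the function names and the input values are illustrative assumptions.

```python
def bracket_simple_ratio(c0, c1, r01):
    """Bracket of V(ybar/xbar_1): C0^2 + C1^2 - 2*rho01*C0*C1."""
    return c0 ** 2 + c1 ** 2 - 2 * r01 * c0 * c1


def bracket_ratio_of_regression(c0, c1, r01, r02, r12):
    """Bracket of the first-order MSE of the ratio of regression estimators."""
    return (c0 ** 2 * (1 - r02 ** 2)
            + c1 ** 2 * (1 - r12 ** 2)
            - 2 * c0 * c1 * (r01 - r12 * r02))


def gain(c0, c1, r02, r12):
    """Reduction due to the regression adjustment: (C0*rho02 - C1*rho12)^2."""
    return (c0 * r02 - c1 * r12) ** 2


# Made-up coefficients of variation and correlations.
c0, c1, r01, r02, r12 = 0.3, 0.2, 0.6, 0.5, 0.4
print(bracket_simple_ratio(c0, c1, r01),
      bracket_ratio_of_regression(c0, c1, r01, r02, r12),
      gain(c0, c1, r02, r12))
```

The adjustment brings no gain only when C0*rho02 = C1*rho12.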

8.4.5 Combination of Ratio and Product Estimators 


Let us suppose that two auxiliary characteristics $x_1$ and $x_2$ are such that the ratio estimator using $x_1$ and the product estimator using $x_2$ to estimate the population mean $\bar{Y}$ happen to be less efficient than the simple unbiased estimator $\bar{y}$. The question arises as to how both $x_1$ and $x_2$ should be used in an estimator which is simple in form, like the ratio or product estimator, and is more precise than the simple mean $\bar{y}$. Thus, for SRSWOR, Singh (1967a) has proposed a ratio-cum-product estimator

$$\hat{\bar{Y}}_{s1} = \bar{y}\,(\bar{X}_1/\bar{x}_1)\,(\bar{x}_2/\bar{X}_2),$$

where y is positively correlated with $x_1$ and negatively correlated with $x_2$, and a ratio-cum-ratio estimator

$$\hat{\bar{Y}}_{s2} = \bar{y}\,(\bar{X}_1/\bar{x}_1)\,(\bar{X}_2/\bar{x}_2),$$

when y is positively correlated with both $x_1$ and $x_2$. $\hat{\bar{Y}}_{s1}$ and $\hat{\bar{Y}}_{s2}$ are constructed by product and ratio adjustments, respectively, to the ratio estimator $\bar{y}(\bar{X}_1/\bar{x}_1)$.

Finally, Singh (1967a) has also considered a product type estimator $\hat{\bar{Y}}_{s3} = (\bar{y}\,\bar{x}_1\bar{x}_2)/(\bar{X}_1\bar{X}_2)$, when y is negatively correlated with both $x_1$ and $x_2$.

$\hat{\bar{Y}}_{s1}$, $\hat{\bar{Y}}_{s2}$ and $\hat{\bar{Y}}_{s3}$ are biased estimators. To the first order of approximation,

$$\mathrm{MSE}(\hat{\bar{Y}}_{s1}) = \mathrm{MSE}(\hat{\bar{Y}}_{R1}) + \frac{1 - f}{n}\,\bar{Y}^2\left[C_2^2 + 2C_0C_2\rho_{02} - 2C_1C_2\rho_{12}\right],$$

$$\mathrm{MSE}(\hat{\bar{Y}}_{s2}) = \mathrm{MSE}(\hat{\bar{Y}}_{R1}) + \frac{1 - f}{n}\,\bar{Y}^2\left[C_2^2 - 2C_0C_2\rho_{02} + 2C_1C_2\rho_{12}\right]$$

and

$$\mathrm{MSE}(\hat{\bar{Y}}_{s3}) = \mathrm{MSE}(\hat{\bar{Y}}_{P}) + \frac{1 - f}{n}\,\bar{Y}^2\left[C_2^2 + 2C_0C_2\rho_{02} + 2C_1C_2\rho_{12}\right],$$

where $\hat{\bar{Y}}_{R1} = (\bar{y}/\bar{x}_1)\bar{X}_1$ and $\hat{\bar{Y}}_{P} = \bar{y}\,\bar{x}_1/\bar{X}_1$; $C_i$ (i = 0, 1, 2) are the coefficients of variation of y, $x_1$ and $x_2$ respectively; $\rho_{0i}$ is the correlation coefficient between y and $x_i$; and $\rho_{12}$ is the correlation coefficient between $x_1$ and $x_2$.

Swain (1973) has considered Singh's ratio-cum-product type estimators with several auxiliary variables in a two stage sampling design. Sahoo and Swain (1980) have derived unbiased estimators corresponding to $\hat{\bar{Y}}_{s1}$ by the Hartley-Ross technique and have shown that these estimators happen to be particular cases of Williams' (1961) and Mickey's (1959) generalized unbiased estimators.
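The first-order comparison behind Singh's ratio-cum-product estimator can be illustrated numerically. The sketch below works with the bracketed terms only (each first-order MSE is the bracket times (1 - f) Ybar^2 / n); the function names and the chosen coefficients of variation and correlations are made-up illustrative values, picked so that the ratio and product estimators are individually worse than the simple mean while the ratio-cum-product estimator is better.

```python
def bracket_mean(c0):
    return c0 ** 2


def bracket_ratio(c0, c1, r01):
    """Ratio estimator using x1."""
    return c0 ** 2 + c1 ** 2 - 2 * r01 * c0 * c1


def bracket_product(c0, c, r):
    """Product estimator using one auxiliary variable."""
    return c0 ** 2 + c ** 2 + 2 * r * c0 * c


def bracket_s1(c0, c1, c2, r01, r02, r12):
    """Ratio-cum-product: ratio on x1, product on x2."""
    return bracket_ratio(c0, c1, r01) + c2 ** 2 + 2 * c0 * c2 * r02 - 2 * c1 * c2 * r12


def bracket_s2(c0, c1, c2, r01, r02, r12):
    """Ratio-cum-ratio: ratio on both x1 and x2."""
    return bracket_ratio(c0, c1, r01) + c2 ** 2 - 2 * c0 * c2 * r02 + 2 * c1 * c2 * r12


def bracket_s3(c0, c1, c2, r01, r02, r12):
    """Product on both x1 and x2."""
    return bracket_product(c0, c1, r01) + c2 ** 2 + 2 * c0 * c2 * r02 + 2 * c1 * c2 * r12


c0 = c1 = c2 = 1.0
r01, r02, r12 = 0.45, -0.45, 0.2
print(bracket_mean(c0), bracket_ratio(c0, c1, r01),
      bracket_product(c0, c2, r02), bracket_s1(c0, c1, c2, r01, r02, r12))
```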

When the relation between y and $x_1$ is linear but not passing through the origin, while the relation between y and $x_2$ is linear passing through the origin, Mohanty (1967) and Swain (1973b) proposed regression-cum-ratio and regression-cum-product estimators such as

$$\hat{\bar{Y}}_{Reg\text{-}R} = [\bar{y} + b_{01}(\bar{X}_1 - \bar{x}_1)]\,(\bar{X}_2/\bar{x}_2) \quad\text{and}\quad \hat{\bar{Y}}_{Reg\text{-}P} = [\bar{y} + b_{01}(\bar{X}_1 - \bar{x}_1)]\,(\bar{x}_2/\bar{X}_2),$$

where $b_{01}$ is the sample regression coefficient of y on $x_1$.

8.4.6 General Classes of Estimators of the Population Mean 
Using Multi-auxiliary Information 

General classes of estimators of the population mean $\bar{Y}$ using multi-auxiliary information and satisfying certain regularity conditions (Srivastava and Jhajj, 1981, 1983), assuming advance knowledge of $\bar{X}_1, \ldots, \bar{X}_p$, may be defined as

i) $\hat{\bar{Y}}_h = \bar{y}\,h(\bar{x}_1/\bar{X}_1, \ldots, \bar{x}_p/\bar{X}_p)$, ii) $\hat{\bar{Y}}_g = g(\bar{y}, \bar{x}_1, \ldots, \bar{x}_p)$,

where g is a much wider class than h.

Such classes assume h and g to be continuous functions having partial derivatives of a certain order, which may not be appropriate to survey sampling situations. However, such assumptions permit a Taylor series expansion whose leading terms may best approximate the estimator when both N and n are large.

Prior information on other parametric functions of the auxiliary characters, such as variances or coefficients of variation, may sometimes be exploited to obtain more efficient estimates, and in such cases the classes of estimators may be defined accordingly. For instance, in a bivariate situation with knowledge of $S_x^2$ or $C_x^2$, a class of estimators may be defined as

$\hat{\bar{Y}}_h = \bar{y}\,h(\bar{x}/\bar{X},\, s_x^2/S_x^2)$ or $\hat{\bar{Y}}_h = \bar{y}\,h(\bar{x}/\bar{X},\, c_x^2/C_x^2)$, etc.

8.4.7 Asymptotic Distribution of Multivariate Regression 
Estimator 

Fuller (1975) has established the asymptotic normality of the multivariate regression estimator when the finite population is regarded as a sample from a super population, subject to certain mild conditions. Following the model free approach of Scott and Wu (1981) for showing the asymptotic normality of ratio and regression estimators, Ch[…] (1984) has shown the asymptotic normality of the multivariate regression estimator in simple random sampling without replacement under the Hajek condition and some conditions on the uniform boundedness of certain correlations.

8.5 Finite Population Variance Estimation Using Auxiliary Information 

Similar to the estimation of the finite population mean, efficient estimators of the finite population variance may be constructed using one or more auxiliary variables in either ratio or regression form. Although the estimation of the finite population variance arises while estimating the variance of the estimate of the finite population mean or total, it was only through the work of Evans (1951) and Liu (1974) that separate attention was given to estimation of the finite population variance, which is no doubt an important parametric function of interest in sample surveys. Encouraged by the theoretical framework of Liu (1974) for the estimation of the finite population variance in a general set up, Das and Tripathy (1978) suggested ratio and difference methods of estimation using either the known population mean, variance or coefficient of variation of the auxiliary variable. Srivastava and Jhajj (1980) have suggested a general class of estimators for estimating the population variance with the use of auxiliary information. Isaki (1983) has proposed variance estimators under several sampling designs and has compared them when auxiliary information is not available.

8.5.1 Ratio Estimator of Finite Population Variance in SRSWOR 

With the usual notation, define the ratio estimator of $S_y^2$ as

$$\hat{S}^2_{yR} = (s_y^2/s_x^2)\,S_x^2.$$

Assuming the validity of the Taylor series expansion,

$$E(\hat{S}^2_{yR}) = S_y^2\left[1 + \frac{V(s_x^2)}{S_x^4} - \frac{\mathrm{cov}(s_y^2, s_x^2)}{S_x^2 S_y^2}\right]$$

to O(1/n). Thus $\hat{S}^2_{yR}$ is design biased and the bias may be assumed to be negligible in large samples. Further, to O(1/n),

$$\mathrm{MSE}(\hat{S}^2_{yR}) = V(\hat{S}^2_{yR}) = V(s_y^2) + R_1^2\,V(s_x^2) - 2R_1\,\mathrm{cov}(s_y^2, s_x^2),$$

where $R_1 = S_y^2/S_x^2$.

In SRSWOR,

$$V(s_y^2) = \alpha_1\mu_4(y) - \alpha_2\mu_2^2(y),$$

$$V(s_x^2) = \alpha_1\mu_4(x) - \alpha_2\mu_2^2(x)$$

and

$$\mathrm{cov}(s_y^2, s_x^2) = \alpha_1\mu_{22}(y, x) - \alpha_2\mu_2(y)\mu_2(x),$$

where

$$\alpha_1 = \frac{N(N - n)(N - 1)(Nn - N - n - 1)}{n(n - 1)(N - 1)^2(N - 2)(N - 3)}$$

and

$$\alpha_2 = \frac{N(N - n)(N^2n - 3N^2 + 6N - 3n - 3)}{n(n - 1)(N - 1)^2(N - 2)(N - 3)};$$

$\mu_2$ and $\mu_4$ are the 2nd and 4th central moments respectively (taken here with divisor N) and $\mu_{22}$ is the (2,2)th bivariate moment.
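The exact SRSWOR variance of the sample variance can be checked by brute force on a small population. The sketch below assumes the finite population form V(s^2) = alpha_1 mu_4 - alpha_2 mu_2^2 with central moments taken with divisor N, alpha_1 = N(N-n)(Nn-N-n-1) / [n(n-1)(N-1)(N-2)(N-3)] and alpha_2 = N(N-n)(N^2 n - 3N^2 + 6N - 3n - 3) / [n(n-1)(N-1)^2(N-2)(N-3)], and compares it with direct enumeration of all samples. All function names and the toy populations are illustrative.

```python
from itertools import combinations


def sample_variance(values):
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def central_moment(values, r):
    """r-th central moment of the population, divisor N."""
    N = len(values)
    mean = sum(values) / N
    return sum((v - mean) ** r for v in values) / N


def alphas(N, n):
    """Coefficients alpha_1, alpha_2 (requires N >= 4 and n >= 2)."""
    base = n * (n - 1) * (N - 1) * (N - 2) * (N - 3)
    a1 = N * (N - n) * (N * n - N - n - 1) / base
    a2 = N * (N - n) * (N * N * n - 3 * N * N + 6 * N - 3 * n - 3) / (base * (N - 1))
    return a1, a2


def variance_by_formula(pop, n):
    a1, a2 = alphas(len(pop), n)
    return a1 * central_moment(pop, 4) - a2 * central_moment(pop, 2) ** 2


def variance_by_enumeration(pop, n):
    """Variance of s^2 over all equally likely SRSWOR samples of size n."""
    values = [sample_variance(c) for c in combinations(pop, n)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


pop = [0, 0, 0, 0, 1]
print(variance_by_formula(pop, 2), variance_by_enumeration(pop, 2))
```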

In case N is very large compared to n, or in SRSWR, the approximate expression to O(1/n) for $\mathrm{MSE}(\hat{S}^2_{yR})$ reduces to

$$\mathrm{MSE}(\hat{S}^2_{yR}) = (\sigma_y^4/n)\,[\beta_2(y) + \beta_2(x) - 2\beta_{22}(y, x)],$$

where $\beta_2(\cdot) = \mu_4(\cdot)/\mu_2^2(\cdot)$, $\beta_{22}(y, x) = \mu_{22}(y, x)/[\mu_2(y)\mu_2(x)]$ and $\sigma_y^2 = \mu_2(y)$.

Now, $\hat{S}^2_{yR}$ would be more efficient than $s_y^2$ if

$$\rho(s_y^2, s_x^2) > \tfrac{1}{2}\,(\text{C.V. of } s_x^2)/(\text{C.V. of } s_y^2).$$

Under bivariate normality of (y, x), $\hat{S}^2_{yR}$ would be more efficient than $s_y^2$ if $\rho^2 > 1/2$, or $|\rho| > \sqrt{1/2}$, where $\rho$ is the correlation coefficient between y and x.

Swain and Mishra (1994) have considered the asymptotic normality of $\hat{S}^2_{yR}$ under certain limiting conditions.

The ratio estimator $(s_y^2/s_x^2)S_x^2$ can be generalized to a multivariate ratio estimator in the manner considered by Olkin (1958). Thus, Isaki (1983) has shown that if $y, x_1, x_2, \ldots, x_p$ follow an equi-correlated (p + 1)-variate normal distribution, Olkin's weighted multivariate ratio estimator

$$\hat{S}^2_{wR} = \sum_{i=1}^{p} W_i \left(\frac{s_y^2}{s_{x_i}^2}\right) S_{x_i}^2, \quad\text{with } \sum_{i=1}^{p} W_i = 1,$$

is more efficient than $s_y^2$ if $|\rho| > (p + 1)^{-1/2}$, where $\rho$ is the correlation between any two variables. An alternative multivariate ratio estimator with p auxiliary variables may be suggested as

[…]

where $W_i^*$ are suitably chosen constants with $\sum W_i^* = 1$.


8.5.2 Unbiased Ratio Estimator 

Consider the Midzuno type sampling scheme of Nanjamma, Murthy and Sethi (NMS), in which the first two units are selected with probability proportional to $(x_i - x_j)^2$ and the remaining (n - 2) units are selected with equal probability without replacement from the remaining (N - 2) units of the population. The probability of selecting a sample s is thus

$$P(s) = s_x^2 \Big/ \left[\binom{N}{n} S_x^2\right].$$

Under such a NMS sampling scheme $\hat{S}^2_{yR}$ becomes design unbiased. Swain and Mishra (1992) have shown that, to terms of O(1/n), $\hat{S}^2_{yR}$ under the NMS sampling scheme has the same variance as $\hat{S}^2_{yR}$ under SRSWOR. Further, under bivariate normality and to terms of $O(1/n^2)$, assuming N to be large compared to n,

$$\mathrm{MSE}(\hat{S}^2_{yR} \mid \text{SRSWOR}) - \mathrm{MSE}(\hat{S}^2_{yR} \mid \text{NMS}) = (8S_y^4/n^2)\,(1 - \rho^2)(3 - 2\rho^2),$$

which is always positive, thus showing the superiority of $\hat{S}^2_{yR}$ under the Midzuno type Nanjamma-Murthy-Sethi sampling scheme.
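Design unbiasedness under a scheme with P(s) proportional to s_x^2 can be confirmed by enumerating every possible sample of a small population. The sketch below assumes P(s) = s_x^2 / [C(N, n) S_x^2] and auxiliary values that are all distinct (so that no sample has s_x^2 = 0); the function names and data are illustrative.

```python
from itertools import combinations
from math import comb


def s2(values):
    """Variance with divisor (len - 1)."""
    m = len(values)
    mean = sum(values) / m
    return sum((v - mean) ** 2 for v in values) / (m - 1)


def nms_probabilities(x, n):
    """P(s) = s_x^2 / (C(N, n) * S_x^2) for every sample s of size n."""
    N = len(x)
    big_s2x = s2(x)
    return {idx: s2([x[i] for i in idx]) / (comb(N, n) * big_s2x)
            for idx in combinations(range(N), n)}


def expected_ratio_estimator(x, y, n):
    """Design expectation of (s_y^2 / s_x^2) * S_x^2 under P(s)."""
    big_s2x = s2(x)
    total = 0.0
    for idx, prob in nms_probabilities(x, n).items():
        sx = s2([x[i] for i in idx])
        sy = s2([y[i] for i in idx])
        total += prob * (sy / sx) * big_s2x
    return total


x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [3.0, 5.0, 4.0, 10.0, 12.0]
print(expected_ratio_estimator(x, y, 3), s2(y))  # the two values agree
```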

Swain and Mishra (1992) have suggested a Hartley-Ross type unbiased estimator of $S_y^2$.

Swain and Mishra (1992) have also used Quenouille's technique to obtain an almost unbiased ratio type estimator of the finite population variance. They have shown that Quenouille's technique of bias reduction not only reduces the bias to $O(1/n^2)$ but also reduces the variance to $O(1/n^2)$ under the assumption of bivariate normality. This feature was also observed by Durbin (1959) while dealing with the ratio estimator of the finite population mean.

8.5.3 Regression Estimator 


Isaki (1983) has considered a multivariate regression type estimator to estimate the finite population variance $\sigma_y^2$ in SRSWR. For two auxiliary variables, Isaki's multivariate regression type estimator is given by

$$\hat{\sigma}^2_{y,reg} = s_y^2 + \lambda_1(\sigma^2_{x_1} - s^2_{x_1}) + \lambda_2(\sigma^2_{x_2} - s^2_{x_2}),$$

where $\lambda_1$ and $\lambda_2$ are suitably chosen constants. Minimizing $V(\hat{\sigma}^2_{y,reg})$ for choice of $\lambda_1$ and $\lambda_2$, the optimum variance reduces to

$$V_{opt}(\hat{\sigma}^2_{y,reg}) = (2/n)\,\sigma_y^4\left[1 - \frac{2\rho^4}{1 + \rho^2}\right],$$

where y, $x_1$ and $x_2$ follow a trivariate equicorrelated normal distribution with correlation coefficient $\rho$.

Swain and Mishra (1990) have suggested a modified regression type estimator

$$\hat{\sigma}^{*2}_{y,reg} = s_y^2 + \lambda_1^*(\sigma^2_{x_1} - s^2_{x_1}) + \lambda_2^*(\sigma^2_{x_2} - s^2_{x_2}) + \lambda_3^*(\sigma_{x_1x_2} - s_{x_1x_2}),$$

where $\sigma_{x_1x_2}$ and $s_{x_1x_2}$ are the population and sample covariances respectively. The optimum variance of $\hat{\sigma}^{*2}_{y,reg}$ is given by

$$V_{opt}(\hat{\sigma}^{*2}_{y,reg}) = (2/n)\,\sigma_y^4\left[1 - \frac{4\rho^4}{(1 + \rho)^2}\right],$$

which is never greater than $V_{opt}(\hat{\sigma}^2_{y,reg})$, since $2(1 + \rho^2) \ge (1 + \rho)^2$.
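The two optimum variances can be reproduced as a small least squares calculation. The sketch below is an illustration under stated assumptions: large-sample SRSWR from an equicorrelated normal distribution with unit variances, for which n times the covariance of two sample second moments s_ab and s_cd is sigma_ac sigma_bd + sigma_ad sigma_bc. It returns n V_opt / sigma_y^4 and compares it with the closed forms 2[1 - 2 rho^4/(1 + rho^2)] and 2[1 - 4 rho^4/(1 + rho)^2]. The helper names are hypothetical.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def scaled_optimum_variance(rho, use_covariance_term):
    """n * V_opt / sigma_y^4 for the regression type variance estimator."""
    c = [2 * rho ** 2, 2 * rho ** 2]            # n*cov(s_y^2, s_xi^2)
    s = [[2.0, 2 * rho ** 2],
         [2 * rho ** 2, 2.0]]                   # n*cov among s_x1^2, s_x2^2
    if use_covariance_term:                     # add s_x1x2 as a third regressor
        c = c + [2 * rho ** 2]
        s = [[2.0, 2 * rho ** 2, 2 * rho],
             [2 * rho ** 2, 2.0, 2 * rho],
             [2 * rho, 2 * rho, 1 + rho ** 2]]
    lam = solve(s, c)
    return 2.0 - sum(ci * li for ci, li in zip(c, lam))


for rho in (0.3, 0.6, 0.9):
    print(rho, scaled_optimum_variance(rho, False), scaled_optimum_variance(rho, True))
```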


8.6 Estimation of Finite Population Mean Vector 
Corresponding to Several Characters 

In most large scale sample surveys it is common practice to study several characters simultaneously. These characters are, individually, important in their own right, but some of them may be of less importance compared to others. The problems arising out of the study of multiple characteristics, if the vector of means is to be estimated, are

a) optimum allocation of sample sizes to different strata.

b) the choice of a suitable set of auxiliary variables to act as a measure of size for selecting the units with varying probabilities.

c) existence of a best unbiased estimator of the vector of population means.

d) the choice of an optimum sampling strategy for estimating the parametric vector under suitably chosen super population models.

Let U = (1, ..., N) be a finite population of N given units and $\{Y_{ij}\}$ the values corresponding to the jth observation of the ith character, i = 1, ..., p and j = 1, ..., N. For a SRSWOR of size n, an unbiased estimator of the mean vector

$\bar{Y} = (\bar{Y}_1, \ldots, \bar{Y}_p)'$ is given by $\bar{y} = (\bar{y}_1, \ldots, \bar{y}_p)'$,

where

$$\bar{y}_i = \frac{1}{n}\sum_{j=1}^{n} y_{ij}.$$

The mean square and product matrix of error is

$$\mathrm{MSPME}(\bar{y}) = \frac{1}{n}(1 - f)\,S_{yy},$$

where $S_{yy} = (S_{ir})$ is the finite population variance-covariance matrix with

$$S_{ir} = (N - 1)^{-1}\sum_{j=1}^{N}(Y_{ij} - \bar{Y}_i)(Y_{rj} - \bar{Y}_r).$$

An unbiased estimator of the MSPME of $\bar{y}$ is

$$\mathrm{mspme}(\bar{y}) = \frac{1}{n}(1 - f)\,s_{yy},$$

where $s_{yy} = (s_{ir})$ is the sample variance-covariance matrix with

$$s_{ir} = (n - 1)^{-1}\sum_{j=1}^{n}(y_{ij} - \bar{y}_i)(y_{rj} - \bar{y}_r).$$
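The estimator of the mean vector and of its MSPME can be computed directly from the sample. The following is a minimal sketch, assuming the sample is stored as a list of rows, one row per sampled unit with p characters per row; the function names and the data are illustrative.

```python
def mean_vector(sample):
    """Sample mean of each character; sample is a list of n rows of length p."""
    n = len(sample)
    p = len(sample[0])
    return [sum(row[i] for row in sample) / n for i in range(p)]


def mspme_estimate(sample, N):
    """Unbiased estimate (1 - f)/n * s_yy of the MSPME under SRSWOR."""
    n = len(sample)
    p = len(sample[0])
    f = n / N
    ybar = mean_vector(sample)
    s_yy = [[sum((row[i] - ybar[i]) * (row[r] - ybar[r]) for row in sample) / (n - 1)
             for r in range(p)]
            for i in range(p)]
    return [[(1 - f) / n * s_ir for s_ir in row] for row in s_yy]


sample = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]   # n = 3 units, p = 2 characters
print(mean_vector(sample))
print(mspme_estimate(sample, N=30))
```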

[…] and Chattree (1989), in certain specific situations, have considered […]

8.7 Super Population Regression Models
Model Dependent Estimators vs. Model Based Estimators

8.7.1 Introduction

For a sample s from a finite population […] The familiar […] link between […] through the help of a design p(s). This is the design based approach, where the randomised selection of the sample s according to a given design p(s) is the source of inference. Such an inference, termed design based inference (DESI), is criticized on the ground that the inference is drawn in a restrictive situation pertaining only to the finite population under consideration and does not throw any light on other finite populations having more or less similar characteristics. Sometimes a super population model ($\xi$), with the assumption that the finite population values $Y_1, \ldots, Y_N$ are the realized values of N random variables having a specified joint distribution, is invoked only to assess the efficiency of an estimator t and to choose the best strategy (p, t) by using the criterion $E_m E_p(t - Y)^2$. Both t and Y are regarded as random variables under the model, and the random variable t is a predictor of the random variable Y.

The model dependent approach, proposed by Royall (1970, 1971, 1978), Royall and Herson (1973), Royall and Eberhardt (1975), Royall and Cumberland (1981a, b) and others, advocates evaluation and inference based on the super population model and the realized (fixed) sample s, but not on other possible samples. As such, the sampling design p(s) and the inclusion probabilities play no role in the inference, and hence the concept of design unbiasedness is regarded as unimportant from the inference point of view. However, randomized selection, if any, safeguards only against selection bias. The model dependent inference (MODI) consists in finding the best estimator by minimizing the model mean square error given s, that is, $E_m[(t - Y)^2 \mid s]$, under the constraint $E_m[(t - Y) \mid s] = 0$. The adherents of the classical randomization method criticise the model dependent estimators on the ground that they often lack design consistency and can be seriously design biased when the real situation differs from the assumed model.

Although model dependent inference is sensitive to departures from the model, it is believed that the proper use of models has much to offer. The controversy as to whether finite population inferences should be guided by consideration of p alone, $\xi$ alone or p and $\xi$ simultaneously is not yet resolved satisfactorily, although some amount of reconciliation has been brought into the literature through the work of Godambe and Thompson (1977), Brewer (1979), Sarndal (1980) and Robinson and Sarndal (1983).

8.7.2 Model Dependent Estimation and Optimal Predictors 
To find a model dependent predictor of Y, write

$$Y = \sum_{s} Y_k + \sum_{\bar{s}} Y_k$$

after the data d has been observed, where $\bar{s}$ denotes the set of unsampled units. Thus, the estimator t dependent on d is written as

$$t = \sum_{s} Y_k + t^*, \quad\text{where } t^* = t - \sum_{s} Y_k.$$

The prediction of Y boils down to prediction of the random variable $\sum_{\bar{s}} Y_k$, since $\sum_{s} Y_k$ is known from the sample. Thus, Y is predicted by predicting $\sum_{\bar{s}} Y_k$ from the assumed model and then adding this value to $\sum_{s} Y_k$.

Definition (Royall, 1970). A statistic V is called an unbiased predictor of a random variable W if V is a $\xi$-unbiased estimator of $E_m(W)$.

Thus, $t^*$ will be an unbiased predictor of $\sum_{\bar{s}} Y_k$ if $E_m(t^*) = E_m(\sum_{\bar{s}} Y_k)$ for all s. That is, $E_m[t - \sum_{s} Y_k] = E_m[\sum_{\bar{s}} Y_k]$ or, $E_m(t - Y) = 0$ for all s.

Two types of mean square errors in this model dependent approach may be worth noting. These are:

(i) $E_m E_p(t - Y)^2$ and (ii) $E_p E_m(t - \mu)^2$, where $\mu = E_m(Y)$.

As suggested by Sarndal (1980), if the real interest centres around the finite population in question, then $E_m E_p(t - Y)^2$ is the appropriate criterion for choosing the optimal strategy; if we are to derive an optimal predictor of Y for some future population having the same characteristics as the present one, with the same $\mu$, the real interest lies in $\mu$ and the criterion is $E_p E_m(t - \mu)^2$. For a non-informative design (that is, when p(s) is independent of y) the operators $E_p$ and $E_m$ are interchangeable. Now,

$$M = E_m E_p(t - Y)^2 = E_p V_m(t) + E_p[E_m(t - Y)]^2 + V_m(Y) - 2E_m[(Y - \mu)E_p(t - \mu)]$$

$$= E_p\left[V_m(t^*) + V_m\left(\sum_{\bar{s}} Y_k\right) - 2\,\mathrm{cov}_m\left(t^*, \sum_{\bar{s}} Y_k\right)\right],$$

if t is m-unbiased for Y, that is, the model bias $E_m(t - Y) = 0$. Thus, if the $Y_k$'s are independent, M is minimum if $V_m(t^*)$ is minimum, and thus for a given s the optimal m-unbiased predictor of Y is

$$t_{opt} = \sum_{s} Y_k + t^*_{opt},$$

where $t^*_{opt}$ is a minimum (model) variance (model) unbiased predictor of $\sum_{\bar{s}} Y_k$.

An optimal strategy $(p^*, t_{opt})$ in the class $(\mathcal{P}, \mathcal{T})$ is therefore one for which $M(p^*, t_{opt}) \le M(p, t)$ for all $p \in \mathcal{P}$, $t \in \mathcal{T}$.

It may be seen here that the form of $t_{opt}$ does not depend on the sample design, as it does in the case of p-unbiased estimators (e.g. the Horvitz-Thompson estimator). Thus, in the m-dependent approach, emphasis is centred on the structure of the super population model and the predictor t. The choice of a suitable p is only of secondary importance.

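A: Super Population Linear Regression Model without Intercept ($\xi$)

$$Y_k = \beta X_k + \varepsilon_k,$$

where the $X_k$'s are non-stochastic and the $\varepsilon_k$'s are independent random variables with $E_m(\varepsilon_k) = 0$, $V_m(\varepsilon_k) = \sigma_k^2$, $\mathrm{cov}_m(\varepsilon_k, \varepsilon_l) = 0$ for $k \ne l$.

Under the $\xi$-model,

$$E_m\left[\sum_{\bar{s}} Y_k\right] = \beta\sum_{\bar{s}} X_k.$$

Thus, a model unbiased predictor for $\sum_{\bar{s}} Y_k$ should be such that its model expectation is $\beta\sum_{\bar{s}} X_k$. If $\hat{\beta}$, a function of sample values, is so constructed as to make $E_m(\hat{\beta}) = \beta$, then $\hat{\beta}\sum_{\bar{s}} X_k$ is a model unbiased estimate of $\beta\sum_{\bar{s}} X_k$. Hence, $t^* = t - \sum_{s} Y_k$ is an unbiased predictor of $\sum_{\bar{s}} Y_k$ if

$$E_m\left(t - \sum_{s} Y_k\right) = E_m\left(\sum_{\bar{s}} Y_k\right) = \beta\sum_{\bar{s}} X_k = E_m\left[\hat{\beta}\sum_{\bar{s}} X_k\right],$$

which implies

$$t - \sum_{s} Y_k = \hat{\beta}\sum_{\bar{s}} X_k \quad\text{or}\quad t = \sum_{s} Y_k + \hat{\beta}\sum_{\bar{s}} X_k.$$

If $\hat{\beta} = \sum_{s} l_k Y_k$, that is, a linear function of sample values, the condition $E_m(\hat{\beta} - \beta) = 0$ implies $\sum_{s} l_k X_k = 1$. Further, t is then a linear model unbiased predictor of Y.

For a particular sample s, the model mean square error under $\xi$ is

$$\mathrm{MSE}_m(t) = E_m(t - Y)^2 = V_m(t - Y)$$

$$= E_m\left[\hat{\beta}\sum_{\bar{s}} X_k - \beta\sum_{\bar{s}} X_k - \sum_{\bar{s}} \varepsilon_k\right]^2$$

$$= E_m\left[\left(\sum_{\bar{s}} X_k\right)(\hat{\beta} - \beta) - \sum_{\bar{s}} \varepsilon_k\right]^2$$

$$= E_m\left[\left(\sum_{\bar{s}} X_k\right)\sum_{s} l_k\varepsilon_k - \sum_{\bar{s}} \varepsilon_k\right]^2$$

$$= \left(\sum_{\bar{s}} X_k\right)^2\sum_{s} l_k^2\sigma_k^2 + \sum_{\bar{s}} \sigma_k^2.$$

The optimal $\xi$-unbiased predictor is one for which $\mathrm{MSE}_m(t)$ is minimum subject to the condition $E_m(\hat{\beta} - \beta) = 0$, or equivalently $\sum_{s} l_k X_k = 1$. The choice of $l_k$ which minimizes $\mathrm{MSE}_m(t)$ subject to the condition $\sum_{s} l_k X_k = 1$ is

$$l_k = \frac{X_k/\sigma_k^2}{\sum_{s} X_k^2/\sigma_k^2}, \quad\text{from which}\quad \hat{\beta} = \frac{\sum_{s} Y_k X_k/\sigma_k^2}{\sum_{s} X_k^2/\sigma_k^2}.$$

Hence, the best linear unbiased estimator, or more appropriately the best linear unbiased predictor, for Y turns out to be

$$t_{BLUE} \text{ or } t_{BLUP} = \sum_{s} Y_k + \left[\frac{\sum_{s} Y_k X_k/\sigma_k^2}{\sum_{s} X_k^2/\sigma_k^2}\right]\sum_{\bar{s}} X_k.$$

Note: The BLUE of $\beta$ as given above by $\hat{\beta}$ is the same as the weighted least squares estimate of $\beta$ obtained by minimizing $\sum_{s}(Y_k - \beta X_k)^2/\sigma_k^2$, using the generalized Gauss-Markov theorem.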


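Particular Cases

Case 1: $\sigma_k^2 = \sigma^2 X_k^2$

Here,

$$\hat{\beta} = \frac{1}{n}\sum_{s}\frac{Y_k}{X_k}$$

and

$$t_{BLUP} = \sum_{s} Y_k + \left[\frac{1}{n}\sum_{s}\frac{Y_k}{X_k}\right]\sum_{\bar{s}} X_k,$$

$$\mathrm{MSE}_m(t_{BLUP}) = \sigma^2\left[\frac{1}{n}\left(\sum_{\bar{s}} X_k\right)^2 + \sum_{\bar{s}} X_k^2\right].$$

Case 2: $\sigma_k^2 = \sigma^2 X_k$

$$\hat{\beta} = \sum_{s} Y_k \Big/ \sum_{s} X_k,$$

$$t_{BLUP} = \sum_{s} Y_k + \left(\sum_{s} Y_k \Big/ \sum_{s} X_k\right)\sum_{\bar{s}} X_k = \left(\sum_{s} Y_k \Big/ \sum_{s} X_k\right)\sum_{k=1}^{N} X_k,$$

which is the classical ratio estimator $t_R$, and

$$\mathrm{MSE}_m(t_R) = \frac{N^2}{n}\left(1 - \frac{n}{N}\right)\frac{\bar{x}_{\bar{s}}\,\bar{X}}{\bar{x}_s}\,\sigma^2,$$

where $\bar{x}_s$ and $\bar{x}_{\bar{s}}$ are the means of the sampled and unsampled units in the population.

Thus, it follows that if the n units with the largest $X_k$'s are chosen to constitute the sample, then $E_m(t_R - Y)^2$ will be minimal, showing thereby that the optimal sampling design is a purposive one, which selects the sample of n units with the largest $X_k$ values with probability one.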
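The BLU predictor under the no-intercept model can be computed directly from the weighted least squares slope. The sketch below assumes variances of the form sigma_k^2 = sigma^2 v(X_k) with a known function v (the unknown sigma^2 cancels in the slope); the function name, the argument layout and the data are illustrative.

```python
def blup_no_intercept(y_s, x_s, x_unsampled, v):
    """Return (t, beta_hat) for Y_k = beta*X_k + e_k with V(e_k) = sigma^2 * v(X_k).

    y_s, x_s     : study and auxiliary values of the sampled units
    x_unsampled  : auxiliary values of the unsampled units
    v            : known variance function, e.g. lambda x: x or lambda x: x * x
    """
    weights = [x / v(x) for x in x_s]                      # X_k / v(X_k)
    beta_hat = (sum(w * y for w, y in zip(weights, y_s))
                / sum(w * x for w, x in zip(weights, x_s)))
    return sum(y_s) + beta_hat * sum(x_unsampled), beta_hat


y_s, x_s, x_rest = [2.0, 6.0, 12.0], [1.0, 2.0, 4.0], [3.0, 5.0]

# Case 1: variance proportional to X^2 gives the mean of the ratios Y_k / X_k.
print(blup_no_intercept(y_s, x_s, x_rest, lambda x: x * x))

# Case 2: variance proportional to X gives the classical ratio estimator.
print(blup_no_intercept(y_s, x_s, x_rest, lambda x: x))
```

In Case 2 the predictor equals (sum of sampled y / sum of sampled x) times the population total of x, which is 15 in this made-up example.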

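B. Super Population Linear Regression Model with Intercept ($\eta$)

$$Y_k = \alpha + \beta X_k + \varepsilon_k$$

with $E_m(\varepsilon_k) = 0$, $V_m(\varepsilon_k) = \sigma^2$ and $\mathrm{cov}_m(\varepsilon_k, \varepsilon_l) = 0$ ($k \ne l$).

Construct an estimator

$$t = \sum_{s} Y_k + \sum_{s} l_k Y_k.$$

In order that t is an m-unbiased linear predictor of $Y = \sum_{s} Y_k + \sum_{\bar{s}} Y_k$, we should have

$$E_m(t - Y) = 0.$$

This implies $E_m\left(\sum_{s} l_k Y_k\right) = E_m\left(\sum_{\bar{s}} Y_k\right)$, from which $\sum_{s} l_k(\alpha + \beta X_k) = \sum_{\bar{s}}(\alpha + \beta X_k)$.

The required conditions are therefore

i) $\sum_{s} l_k = N - n$,

ii) $\sum_{s} l_k X_k = \sum_{\bar{s}} X_k$.

Now

$$V_m(t - Y) = E_m(t - Y)^2$$

$$= E_m\left[\sum_{s} Y_k + \sum_{s} l_k Y_k - \sum_{s} Y_k - \sum_{\bar{s}} Y_k\right]^2$$

$$= E_m\left[\sum_{s} l_k Y_k - \sum_{\bar{s}} Y_k\right]^2$$

$$= E_m\left[\sum_{s} l_k Y_k - \sum_{\bar{s}}(\alpha + \beta X_k + \varepsilon_k)\right]^2$$

$$= E_m\left[\sum_{s} l_k Y_k - \sum_{\bar{s}}(\alpha + \beta X_k) - \sum_{\bar{s}}\varepsilon_k\right]^2$$

$$= E_m\left[\sum_{s} l_k Y_k - \sum_{s} l_k(\alpha + \beta X_k) - \sum_{\bar{s}}\varepsilon_k\right]^2$$

$$= E_m\left[\sum_{s} l_k Y_k - E_m\sum_{s} l_k Y_k - \sum_{\bar{s}}\varepsilon_k\right]^2$$

$$= E_m\left[\sum_{s} l_k(Y_k - E_m Y_k) - \sum_{\bar{s}}\varepsilon_k\right]^2$$

$$= E_m\left[\sum_{s} l_k\varepsilon_k - \sum_{\bar{s}}\varepsilon_k\right]^2$$

$$= \sum_{s} l_k^2\sigma^2 + (N - n)\sigma^2 = \sigma^2\left[\sum_{s} l_k^2 + N - n\right].$$

Minimization of $V_m(t - Y)$ subject to conditions (i) and (ii) gives the optimum $l_k$'s as

$$l_{k0} = \left(\frac{N}{n} - 1\right) + \frac{N(\bar{X} - \bar{x})}{\sum_{s}(x_k - \bar{x})^2}\,(x_k - \bar{x}),$$

where

$$\bar{X} = \sum_{k=1}^{N}\frac{X_k}{N} \quad\text{and}\quad \bar{x} = \sum_{s}\frac{x_k}{n}.$$

Thus, the optimal predictor or BLU predictor becomes

$$t_{BLU} = \sum_{s} Y_k + \sum_{s} l_{k0} Y_k = N\left[\bar{y} + \hat{\beta}(\bar{X} - \bar{x})\right],$$

where

$$\bar{y} = \frac{1}{n}\sum_{s} Y_k \quad\text{and}\quad \hat{\beta} = \sum_{s}(Y_k - \bar{y})(x_k - \bar{x}) \Big/ \sum_{s}(x_k - \bar{x})^2,$$

which is called the BLU regression predictor and can also be obtained under the model by the Gauss-Markov theorem.

The optimal model mean square error is

$$\mathrm{MSE}_m(t_{BLU}) = E_m(t_{BLU} - Y)^2 = V_m(t_{BLU} - Y) = \left[(N - n) + \sum_{s} l_{k0}^2\right]\sigma^2$$

$$= \left[\frac{N(N - n)}{n} + \frac{N^2(\bar{X} - \bar{x})^2}{\sum_{s}(x_k - \bar{x})^2}\right]\sigma^2,$$

which is minimum if $\bar{x} = \bar{X}$.

This shows that the optimum design turns out to be a purposive one, which rests on choosing the sample s such that $\bar{x}$ is close to $\bar{X}$; this amounts to choosing a balanced sample.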
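The optimum weights and the two equivalent forms of the BLU regression predictor can be checked numerically. This is a minimal sketch, assuming the weights l_k0 = (N/n - 1) + N (Xbar - xbar)(x_k - xbar) / sum_s (x_k - xbar)^2 and the predictor of the population total N[ybar + beta_hat (Xbar - xbar)]; the function names and the data are illustrative.

```python
def blu_weights(x_s, N, X_bar):
    """Optimum weights l_k0 under the linear model with intercept."""
    n = len(x_s)
    x_bar = sum(x_s) / n
    sxx = sum((x - x_bar) ** 2 for x in x_s)
    return [(N / n - 1) + N * (X_bar - x_bar) * (x - x_bar) / sxx for x in x_s]


def blu_predictor(y_s, x_s, N, X_bar):
    """t = sum_s Y_k + sum_s l_k0 * Y_k."""
    weights = blu_weights(x_s, N, X_bar)
    return sum(y_s) + sum(l * y for l, y in zip(weights, y_s))


def regression_form(y_s, x_s, N, X_bar):
    """The same predictor written as N * [ybar + beta_hat * (Xbar - xbar)]."""
    n = len(x_s)
    x_bar, y_bar = sum(x_s) / n, sum(y_s) / n
    sxx = sum((x - x_bar) ** 2 for x in x_s)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_s, y_s))
    return N * (y_bar + (sxy / sxx) * (X_bar - x_bar))


y_s, x_s = [2.0, 6.0, 12.0], [1.0, 2.0, 4.0]
N, X_bar = 5, 3.0                      # population total of x is 15
l = blu_weights(x_s, N, X_bar)
print(sum(l), sum(lk * xk for lk, xk in zip(l, x_s)))   # N - n and the unsampled x total
print(blu_predictor(y_s, x_s, N, X_bar), regression_form(y_s, x_s, N, X_bar))
```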

Now the question arises as to what happens when the ratio predictor $t_R = X(\bar{y}/\bar{x})$, with $X = N\bar{X}$, is used under the super population model (B) with intercept. It may be observed that $t_R$ no longer remains model unbiased, that is,

$$E_m(t_R) = N\left[\alpha\,\frac{\bar{X}}{\bar{x}} + \beta\bar{X}\right]$$

and the model bias is given by

$$E_m(t_R - Y) = N\alpha\left[\frac{\bar{X}}{\bar{x}} - 1\right],$$

which vanishes for the balanced sample. Thus, we reach the interesting conclusion that for the balanced sample, $t_R$ is robust in respect of model bias.

The above discussion may also be extended to polynomial regression and multiple regression models. The most general forms of these models are as follows:

Polynomial Regression Model (PR)

$$E_m(Y_k \mid x_k) = \sum_{j=0}^{J}\delta_j\beta_j x_k^j, \qquad V_m(Y_k \mid x_k) = \sigma^2 v(x_k)$$

and $\mathrm{cov}_m(Y_k, Y_l \mid x_k, x_l) = 0$ for $k \ne l$, where the $\beta_j$'s and $\sigma^2$ are unknown constants, $v(x_k)$ is a known function and $\delta_j = 1\,(0)$ depending on whether the term in $\beta_j$ is present (absent). The best linear m-unbiased predictor of Y for a given s is

$$t_{BLUP} = \sum_{s} Y_k + \sum_{j=0}^{J}\delta_j\hat{\beta}_j\sum_{\bar{s}} x_k^j,$$

where $\hat{\beta}_j$ is the BLUE of $\beta_j$ under the PR model, obtained by the Gauss-Markov theorem.

Multiple Linear Regression Model (MR)

$$E_m(Y_k \mid x_k) = \beta_0 x_{k0} + \beta_1 x_{k1} + \ldots + \beta_p x_{kp} = x_k'\beta,$$

$V_m(Y_k \mid x_k) = \sigma^2 v_k$, and $Y_k$ and $Y_l$ are uncorrelated for $k \ne l$, where $\beta' = (\beta_0, \beta_1, \ldots, \beta_p)$ and $\sigma^2$ are unknown constants (parameters); $x_k' = (x_{k0}, x_{k1}, \ldots, x_{kp})$ with $x_{k0} = 1$ for all k, and $v_k = v(x_k)$ are known.

The BLUP of Y is $t_{BLUP} = \sum_{s} Y_k + \sum_{\bar{s}} x_k'\hat{\beta}$, where $\hat{\beta}$ is the BLUE of $\beta$ under the MR model, obtained by the Gauss-Markov theorem.


8.7.3 Optimality of the Generalized Difference Estimator 
Model based estimators: Consider a general super-population model (GM)

$$\mu_k = E_m(Y_k) = \mu a_k + b_k,$$

$$\sigma_k^2 = E_m(Y_k - \mu_k)^2 = a_k^2\sigma^2,$$

$$\sigma_{kl} = E_m(Y_k - \mu_k)(Y_l - \mu_l) = a_k a_l\rho\sigma^2 \quad (k \ne l),$$

where $a_k > 0$, $b_k$ (k = 1, ..., N) are known numbers, $\sum_{k=1}^{N} a_k = N$, and $\mu$, $\sigma^2$ and $\rho$ are unknown, $-1/(N - 1) < \rho < 1$. Equivalently,

$$Z_k = \frac{Y_k - b_k}{a_k}, \quad k = 1, \ldots, N,$$

is assumed to have common first and second moments.

The p-unbiased generalized difference estimator is given by

$$t_{gd} = \sum_{s}\frac{Y_k - e_k}{N\pi_k} + \sum_{k=1}^{N} e_k/N,$$

where $e = (e_1, \ldots, e_N)$ is an arbitrary real vector and $\pi_k$ is the inclusion probability for the unit labelled k.

The minimum of $E_m V(p, t_{gd})$ is attained if we choose $e_k = \mu_k$ and $\pi_k \propto a_k$ (k = 1, ..., N). The design $p_0 = p_0(s)$ of the optimum strategy $(p_0, t_{gd0})$ satisfies $\pi_k = f a_k$ (k = 1, ..., N), where f = n/N. Since $\sum_{k=1}^{N} a_k = N$, the terms in the unknown $\mu$ cancel, and the optimum generalized difference estimator of $\bar{Y}$ is

$$t_{gd0} = \sum_{s}\frac{Y_k - b_k}{n a_k} + \sum_{k=1}^{N} b_k/N.$$

Let $t = w_{0s} + \sum_{s} w_{ks} Y_k$ be the class $L_u$ of all p-unbiased linear homogeneous or non-homogeneous estimators of $\bar{Y}$. An important theorem due to Cassel, Sarndal and Wretman (1976) is stated as follows:

Theorem 8.3 (CSW): Under the model (GM),

$$E_m(V(p, t)) \ge E_m(V(p_0, t_{gd0})) = (1 - \rho)(1 - fA)\,\sigma^2/n,$$

where $A = \sum_{k=1}^{N} a_k^2/N$ and (p, t) is any competing strategy such that $t \in L_u$ and p(s) is any fixed sample size design of size n with $\pi_k > 0$ (k = 1, ..., N).
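The generalized difference estimator is easy to compute once the vector e and the inclusion probabilities are fixed. The sketch below illustrates, on made-up numbers, that with pi_k = f a_k and sum of a_k equal to N, the choice e_k = mu a_k + b_k gives the same estimate for every value of mu, so the unknown mu does not need to be known. The function name and all data are illustrative assumptions.

```python
def generalized_difference(y_s, pi_s, e_s, e_all, N):
    """t_gd = sum_s (Y_k - e_k) / (N * pi_k) + sum_U e_k / N (estimates the mean)."""
    correction = sum((y - e) / (N * p) for y, e, p in zip(y_s, e_s, pi_s))
    return correction + sum(e_all) / N


# Toy population of N = 4 units with known a_k (summing to N) and b_k.
a = [0.5, 1.0, 1.5, 1.0]
b = [1.0, 2.0, 3.0, 4.0]
N, n = 4, 2
pi = [(n / N) * ak for ak in a]          # pi_k = f * a_k

sample = [1, 2]                          # labels of the sampled units
y_s = [5.0, 9.0]                         # their observed y-values

for mu in (0.0, 7.0):
    e_all = [mu * ak + bk for ak, bk in zip(a, b)]
    t = generalized_difference(y_s, [pi[k] for k in sample],
                               [e_all[k] for k in sample], e_all, N)
    print(mu, t)
```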

Multiple Regression Model

Assume a regression structure relating the variable y to p independent explanatory variables $x_1, \ldots, x_p$, specified by the multiple regression (MR) model $E_m(Y_k) = \beta_0 + \beta_1 x_{k1} + \ldots + \beta_p x_{kp} = x_k'\beta = \mu_k$ and $V_m(Y_k) = \sigma^2 v_k$, where $\beta' = (\beta_0, \beta_1, \ldots, \beta_p)$ and $\sigma^2$ are unknown parameters; $x_k' = (x_{k0}, x_{k1}, \ldots, x_{kp})$ with $x_{k0} = 1$ for all k, and $v_k = v(x_k)$ are known. The $Y_k$'s are independent.

Consider a sample s selected from a finite population U = (1, ..., N) with p(s) > 0 and $\sum_{s} p(s) = 1$, with first order inclusion probabilities $\pi_k = \sum_{s \ni k} p(s)$.

The p-unbiased generalized difference estimator for $\bar{Y}$ under the MR model is written as

$$t_{gd} = \sum_{s}\frac{Y_k - x_k'\beta}{N\pi_k} + \sum_{j=0}^{p}\beta_j\bar{x}_j,$$

where

$$\bar{x}_j = \sum_{k=1}^{N}\frac{x_{kj}}{N}.$$

The corresponding class of generalized regression estimators, as suggested by Sarndal (1980), is

$$t_{gr} = \sum_{s}\frac{Y_k - x_k'\hat{\beta}_s}{N\pi_k} + \sum_{j=0}^{p}\hat{\beta}_{sj}\bar{x}_j,$$

where $\hat{\beta}_s' = (\hat{\beta}_{s0}, \hat{\beta}_{s1}, \ldots, \hat{\beta}_{sp})$ is linear in those $Y_k$ for which $k \in s$.

A form suggested for $\hat{\beta}_s$ is $\hat{\beta}_s = G_s' Y_s$ with $G_s' = (W_s' X_s)^{-1} W_s'$, where (i) $Y_s$ is the column vector with n elements $Y_k$, $k \in s$, and $X_s$ the $n \times (p + 1)$ matrix with the vectors $x_k'$, $k \in s$, as rows; (ii) $W_s$ is a matrix having n rows and (p + 1) columns, with elements which may or may not depend on the known $x_k$ and $v_k$; (iii) $V_s$ is the $n \times n$ diagonal matrix with diagonal elements $v_k$, $k \in s$.

Since $\hat{\beta}_s$ is model unbiased for $\beta$, $t_{gr}$ becomes model unbiased under the MR model. Substitution of $\hat{\beta}_s$ for $\beta$ might introduce a design bias, and we shall be in a happy situation if it vanishes asymptotically. In situations where the model goes wrong, there is a likelihood that the inferences based on $t_{gr}$ are still protected by the randomized selection assuring design unbiasedness, thus providing a source of robustness. Asymptotic design unbiasedness has been discussed by Brewer (1979) and Sarndal (1980) as a practical method to protect against model misspecification in large scale sample surveys.


Different choices of $W_s$ would give different estimators $\hat{\beta}_s$, thus determining $t_{gr}$. Sarndal (1980) considered two choices of weighting, namely $\pi$-inverse weighting and best linear unbiased weighting.

In $\pi$-inverse weighting, $1_s'\Pi_s^{-1} = c'W_s'$ for all s, where $1_s$ is a column vector of n ones, $\Pi_s$ the $n \times n$ diagonal matrix having $\pi_k$, $k \in s$, on the diagonal and $c' = (c_0, c_1, \ldots, c_p)$ is a vector of constants.

Equivalently, $\pi_k^{-1} = \sum_{j=0}^{p} c_j w_{kj}$ (k = 1, ..., N). If $\hat{\beta}_{\pi j}$ is the estimator of $\beta_j$ arising from the $\pi$-inverse weighting, $t_{gr}$ reduces to a simple and appealing form

$$t_{gr} = \sum_{j=0}^{p}\bar{x}_j\hat{\beta}_{\pi j} = \frac{1}{N}\left[\sum_{k=1}^{N} x_{k0}\hat{\beta}_{\pi 0} + \ldots + \sum_{k=1}^{N} x_{kp}\hat{\beta}_{\pi p}\right].$$

The best linear unbiased weighting is defined as $W_s = V_s^{-1}X_s$, where $V_s$ is the $n \times n$ diagonal matrix having $v_k$, $k \in s$, on the diagonal, resulting in

$$\hat{\beta}_s = \hat{\beta}_{BLU} = (X_s'V_s^{-1}X_s)^{-1}X_s'V_s^{-1}Y_s.$$

This gives

$$t_{gr} = \sum_{s}\frac{Y_k - x_k'\hat{\beta}_{BLU}}{N\pi_k} + \sum_{j=0}^{p}\bar{x}_j\hat{\beta}_{BLU,j}.$$

Sarndal (1980) suggested three estimators (rather predictors) of the finite population total Y in the event of $\hat{\beta}$ being any model unbiased estimator of $\beta$. These model unbiased estimators, i.e. with $E_m(t - Y) = 0$, are

$$t_A = \sum_{k=1}^{N} x_k'\hat{\beta},$$

$$t_B = \sum_{s} Y_k + \sum_{\bar{s}} x_k'\hat{\beta} = \sum_{s}(Y_k - x_k'\hat{\beta}) + \sum_{k=1}^{N} x_k'\hat{\beta},$$

$$t_C = \sum_{s}(Y_k - x_k'\hat{\beta})\,\pi_k^{-1} + \sum_{k=1}^{N} x_k'\hat{\beta}.$$

It may be observed that $t_A$ is more a natural predictor of $E_m(Y) = \sum_{k=1}^{N} x_k'\beta$ than of Y itself. $t_B$ is a natural predictor of Y because it predicts only the unsampled part of the population. $t_C$ is the well known Generalized Regression Estimator (GREG) of Cassel, Sarndal and Wretman (1976).

The above three sub-classes of estimators are seen to be particular cases of a single class of estimators called QR predictors (Wright, 1983), given by

$$t_{QR} = \sum_{k=1}^{N} x_k'\hat{\beta} + \sum_{s} r_k e_k,$$

where

$$\hat{\beta} = \left(\sum_{s} q_k x_k x_k'\right)^{-1}\sum_{s} q_k x_k Y_k$$

and $e_k = Y_k - x_k'\hat{\beta}$, with $q_k > 0$ and $r_k \ge 0$ being open to choice.

It may be seen that various choices of $q_k$ and $r_k$ yield familiar and meaningful predictors. For instance,

i) $r_k = 0$ for all k yields the simple projection estimators (SPRO)

$$t_{QR} = \sum_{k=1}^{N} x_k'\hat{\beta} = \sum_{k=1}^{N}\hat{Y}_k = t_A,$$

where $\hat{Y}_k = x_k'\hat{\beta}$.

ii) $r_k = 1$ for all k yields the Linear Prediction Estimators (LPRE), given by

$$t_{QR} = \sum_{s} Y_k + \sum_{\bar{s}}\hat{Y}_k = t_B.$$

iii) for $r_k = 1/\pi_k$, we have the Generalized Regression Estimators (GREG)

$$t_{QR} = \sum_{k=1}^{N}\hat{Y}_k + \sum_{s}\frac{e_k}{\pi_k} = t_C.$$

The beauty of QR predictors stems from the fact that these model based estimators can manage to conform to design based inference (DESI) requirements, including Asymptotic Design Unbiasedness (ADU), by an appropriate choice of the free constants.

$t_A$ and $t_B$ can be accommodated in a single general predictor such as

$$t_{AB} = \sum_{k=1}^{N} x_k'\hat{\beta} + J\sum_{s}(Y_k - x_k'\hat{\beta}),$$

where J is a scalar taking the value either zero ($t_{AB} = t_A$) or 1 ($t_{AB} = t_B$).

Note I: If p = 1 and there is no intercept, take

$$\hat{\beta} = \sum_{s} Y_k(\pi_k^{-1} - J) \Big/ \sum_{s} x_k(\pi_k^{-1} - J).$$

i) When J = 0, $t_{AB}$ leads to the Horvitz-Thompson ratio estimator of Y, that is,

$$\hat{Y}_{rat\text{-}HT} = \left[\sum_{s} Y_k\pi_k^{-1} \Big/ \sum_{s} x_k\pi_k^{-1}\right]X, \quad\text{where X is the total of x.}$$

ii) When J = 1, $t_{AB}$ leads to Brewer's (1979) estimator of Y, that is,

$$\hat{Y}_B = \sum_{s} Y_k + \left(\sum_{k=1}^{N} x_k - \sum_{s} x_k\right)\left[\frac{\sum_{s} Y_k(\pi_k^{-1} - 1)}{\sum_{s} x_k(\pi_k^{-1} - 1)}\right].$$

8.7.4 Asymptotic Design-Unbiasedness 

A predictor t is said to be asymptotically design unbiased (ADU) if and only if $\lim\,[Y - E_p(t)] = 0$. The concept of limit follows from the works of Brewer (1979), Isaki and Fuller (1982), Sarndal (1980) and Robinson and Sarndal (1983).

Defining $\bar{t}_{QR}$ as a predictor of $\bar{Y} = \sum_{k=1}^{N} Y_k/N$, its asymptotic distribution can be explored by defining the random variable

$$\bar{t}'_{QR} = \frac{1}{N}\sum_{k=1}^{N} x_k'B + \frac{1}{N}\sum_{s} r_k E_k,$$

where

$$B = \left(\sum_{k=1}^{N}\pi_k q_k x_k x_k'\right)^{-1}\sum_{k=1}^{N}\pi_k q_k x_k Y_k$$

and $E_k = Y_k - x_k'B$.

Now

$$\bar{t}_{QR} - \bar{t}'_{QR} = \frac{1}{N}\left[\sum_{k=1}^{N} x_k'(\hat{\beta} - B) - \sum_{s} r_k x_k'(\hat{\beta} - B)\right].$$

Ensuring $\operatorname{plim}_{n \to \infty}(\hat{\beta} - B) = 0$ through conditions on the first order inclusion probabilities and the limiting population moments of $x_k$ and $Y_k$, it may be proved that $\operatorname{plim}\,(\bar{t}_{QR} - \bar{t}'_{QR}) = 0$. This leads to a sufficient condition for $\bar{t}_{QR}$ to be ADU, given by $E_p(\bar{t}'_{QR}) - \bar{Y} = 0$ for all Y, that is, $\bar{t}'_{QR}$ is exactly design unbiased. Another equivalent sufficient condition for $\bar{t}_{QR}$ to be ADU for $\bar{Y}$ is that there exists a p-vector $\lambda = (\lambda_1, \ldots, \lambda_p)'$ such that $1 - \pi_k r_k = \pi_k q_k x_k'\lambda$ for all $k \in U$.
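The ADU condition can be checked mechanically for given constants. The sketch below treats only the scalar case (one regressor, so lambda is a single number) and looks for a lambda satisfying 1 - pi_k r_k = pi_k q_k x_k lambda for every unit; the function name, the tolerance and the data are illustrative assumptions.

```python
def adu_lambda(pi, q, r, x, tol=1e-9):
    """Return lambda if 1 - pi_k*r_k = pi_k*q_k*x_k*lambda holds for all k, else None."""
    lhs = [1 - p * rk for p, rk in zip(pi, r)]
    coef = [p * qk * xk for p, qk, xk in zip(pi, q, x)]
    # Least squares value of lambda; the condition holds only if the fit is exact.
    lam = sum(l * c for l, c in zip(lhs, coef)) / sum(c * c for c in coef)
    exact = all(abs(l - c * lam) <= tol for l, c in zip(lhs, coef))
    return lam if exact else None


pi = [0.2, 0.4, 0.6, 0.8]
x = [1.0, 2.0, 3.0, 5.0]

# GREG (r_k = 1/pi_k) satisfies the condition with lambda = 0 for any q_k.
print(adu_lambda(pi, [1.0] * 4, [1 / p for p in pi], x))

# SPRO (r_k = 0) with q_k = 1/(pi_k x_k) satisfies it with lambda = 1.
print(adu_lambda(pi, [1 / (p * xk) for p, xk in zip(pi, x)], [0.0] * 4, x))

# SPRO with q_k = 1 and unequal pi_k x_k does not satisfy it.
print(adu_lambda(pi, [1.0] * 4, [0.0] * 4, x))
```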

Some Important Results 

a) The GREG estimator, being defined by the choice $r_k = \pi_k^{-1}$ in $\bar{t}_{QR}$, is ADU regardless of the choice of $q_k$. This is because every GREG predictor fulfils the ADU condition (with $\lambda = 0$).

b) ADU-ness and a cosmetic form such as SPRO or LPRE can be achieved at the same time by choosing the $q_k$ to satisfy the ADU condition for a particular $\lambda$.

c) If $\bar{t}_{QR}$ is the QR-predictor satisfying the ADU condition with $q_k$ and $r_k$, and if $\bar{t}_{GREG}$ is the generalized regression predictor using the same $q_k$, then $\bar{t}_{QR} = \bar{t}_{GREG}$ for all $Y \in R^N$ and all samples s.

d) If $\bar{t}_{QR}$ satisfies the ADU condition, then its asymptotic expected variance is

$$\frac{1}{N^2}\sum_{k=1}^{N}(\pi_k^{-1} - 1)\,\sigma_k^2,$$

which is also the lower bound of the expected variance for design-unbiased estimators (Godambe and Joshi, 1965).

For details about $\bar{t}_{QR}$ predictors and ADU-ness, the reader may refer to Wright (1983) and Sarndal and Wright (1984).

Brewer's Class of Robust Sampling Designs 

Let us assume a no-intercept model

$$Y_k = \beta x_k + \varepsilon_k, \quad k = 1, \ldots, N,$$

with $E_m(\varepsilon_k) = 0$, $E_m(\varepsilon_k\varepsilon_l) = \sigma_k^2$ if $k = l$, and zero otherwise. Define a class of predictors

$$\hat Y = \sum_s Y_k + \hat\beta\sum_{\bar s} x_k,$$

where $\bar s$ denotes the set of non-sampled units and $\hat\beta$ is a sample estimator of $\beta$. Brewer (1979) used a general homogeneous linear and model unbiased estimator

$$\hat\beta_B = \left(\sum_s W_k Y_k\right) \Big/ \left(\sum_s W_k x_k\right)$$

as an estimate of $\beta$, where the choice of $W_k$ is completely arbitrary, the only condition being that the $W_k$'s are bounded. Thus, Brewer's predictor may be defined as

$$t^*_B = \sum_s Y_k + \hat\beta_B\sum_{\bar s} x_k.$$

For a reasonable choice of the $W_k$'s, a condition is imposed such that $t^*_B$ with $\hat\beta_B$ should be asymptotically design unbiased (ADU) and its asymptotic variance should also be minimum.

Keeping these considerations in view, Brewer (1979) suggested an asymptotic framework, and a resulting analysis, which basically rests on choosing the selection probabilities so as to minimize the design expectation of the model based mean square error of the predictor. The formulation of the problem is as follows:

1. The original population of $N$ units is reproduced $(T - 1)$ times, yielding $T$ populations of $N$ units each.

2. A sample $s_j$ $(j = 1, \ldots, T)$ is selected from each population with the same sampling design and hence the same $\pi_k$.

3. The $T$ populations are aggregated to a population of $NT$ units with population total $Y_T = TY$.

4. The $T$ samples $(s_1, s_2, \ldots, s_T)$ are aggregated to a sample $s(T)$ of $nT$ units.

5. The estimator $t^*_{BT}$, based on the aggregated sample, of the population total $Y_T$ is then formed by using $t^*_B$ and $\hat\beta_B$.

6. $T$ is allowed to tend to infinity. For the asymptotic analysis we deal with $t^*_{BT}/NT$ rather than with $t^*_{BT}$.

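Now,

$$\lim_{T\to\infty} E_p\left(\frac{t^*_{BT}}{NT}\right) = \frac{1}{N}\sum_{k=1}^{N} Y_k\left[\pi_k + \pi_k W_k\,\frac{\sum_{l=1}^{N}(1 - \pi_l)x_l}{\sum_{l=1}^{N}\pi_l W_l x_l}\right].$$

In order that $t^*_{BT}/NT$ is asymptotically design unbiased for $Y/N$, the coefficient of $Y_k$ in the square bracket must be unity. This implies

$$\pi_k = \frac{\sum_l \pi_l W_l x_l}{\sum_l \pi_l W_l x_l + W_k\sum_l(1 - \pi_l)x_l} = \frac{\alpha}{\alpha + W_k},$$

where

$$\alpha = \frac{\sum_{l=1}^{N}\pi_l W_l x_l}{\sum_{l=1}^{N}(1 - \pi_l)x_l}.$$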
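A brief sketch of where the limiting expression comes from may help. It assumes, as an illustration in the spirit of steps 1 to 6 rather than as a result quoted from Brewer, that sums over the aggregated sample $s(T)$, divided by $T$, converge to their design expectations.

```latex
\begin{align*}
\frac{1}{T}\sum_{s(T)} Y_k &\to \sum_{k=1}^{N}\pi_k Y_k, &
\frac{1}{T}\sum_{s(T)} W_k Y_k &\to \sum_{k=1}^{N}\pi_k W_k Y_k, \\
\frac{1}{T}\sum_{s(T)} W_k x_k &\to \sum_{k=1}^{N}\pi_k W_k x_k, &
\frac{1}{T}\sum_{\bar s(T)} x_k &\to \sum_{k=1}^{N}(1-\pi_k)x_k .
\end{align*}
Hence
\[
\frac{t^*_{BT}}{NT} \;\to\;
\frac{1}{N}\left[\sum_{k}\pi_k Y_k
 + \frac{\sum_k \pi_k W_k Y_k}{\sum_l \pi_l W_l x_l}\sum_l (1-\pi_l)x_l\right]
= \frac{1}{N}\sum_{k} Y_k\,\pi_k\left(1 + \frac{W_k}{\alpha}\right),
\qquad
\alpha = \frac{\sum_l \pi_l W_l x_l}{\sum_l (1-\pi_l)x_l}.
\]
Setting $\pi_k(1 + W_k/\alpha) = 1$ gives $\pi_k = \alpha/(\alpha + W_k)$.
```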

The optimum values of $W_k$ and $\pi_k$ are obtained by minimizing the asymptotic value of the design expectation of the model variance of $t^*_{BT}/NT$. For convenience, the minimization of the limiting value of the expected variance of $t^*_{BT}/\sqrt{NT}$ is taken up. Thus,

$$\lim_{T\to\infty} E_pE_m\left[\frac{(t^*_{BT} - Y_T)^2}{NT}\right] = N^{-1}\sum_{k=1}^{N}(1 - \pi_k)\sigma_k^2 + N^{-1}\frac{\left[\sum_{k=1}^{N}(1 - \pi_k)x_k\right]^2}{\left[\sum_{k=1}^{N}\pi_k W_k x_k\right]^2}\sum_{k=1}^{N}\pi_k W_k^2\sigma_k^2.$$

Substituting $\pi_k = \alpha/(\alpha + W_k)$, we have

$$\lim_{T\to\infty} E_pE_m\left[\frac{(t^*_{BT} - Y_T)^2}{NT}\right] = \frac{1}{N}\sum_{k=1}^{N}\frac{W_k\sigma_k^2}{\alpha}.$$

By Schwarz's inequality,

$$\left(\sum_{k=1}^{N}\sigma_k\right)^2 \le \left[\alpha\sum_{k=1}^{N}(\alpha + W_k)^{-1}\right]\left[\alpha^{-1}\sum_{k=1}^{N}(\alpha + W_k)\sigma_k^2\right] = \left(\sum_{k=1}^{N}\pi_k\right)\alpha^{-1}\sum_{k=1}^{N}(\alpha + W_k)\sigma_k^2 = n\left[\sum_{k=1}^{N}\sigma_k^2 + \alpha^{-1}\sum_{k=1}^{N}W_k\sigma_k^2\right].$$

Hence,

$$\frac{1}{\alpha}\sum_{k=1}^{N}W_k\sigma_k^2 \ge \frac{1}{n}\left(\sum_{k=1}^{N}\sigma_k\right)^2 - \sum_{k=1}^{N}\sigma_k^2.$$

The equality is attained, and $\lim E_pE_m(t^*_{BT} - Y_T)^2/NT$ is minimized, when

$$W_k = \alpha\left[n^{-1}\sigma_k^{-1}\sum_{k=1}^{N}\sigma_k - 1\right],$$

from which

$$\pi_k = n\sigma_k \Big/ \sum_{k=1}^{N}\sigma_k.$$

Thus, Brewer's estimator in terms of $\pi_k$ may be written as

$$t^*_B = \sum_{k\in s}Y_k + \left(\sum_{k\notin s}x_k\right)\frac{\sum_{k\in s}Y_k(\pi_k^{-1} - 1)}{\sum_{k\in s}x_k(\pi_k^{-1} - 1)}.$$

For large $N$, $E_pV_m(t^*_B) = \sum_{k=1}^{N}(\pi_k^{-1} - 1)\sigma_k^2$, which is equal to the Godambe-Joshi (1965) lower bound of the expected variance of any unbiased estimator given those $\pi_k$'s. This shows that Brewer's estimator $t^*_B$ is asymptotically most efficient. The minimum is further achieved with respect to $\pi_k$ for the class of designs $p^* \in \rho_n$ with $\pi_k(p^*) \propto \sigma_k$ for all $k = 1, \ldots, N$. Thus, the asymptotic minimum expected variance reduces to

$$\frac{1}{n}\left(\sum_{k=1}^{N}\sigma_k\right)^2 - \sum_{k=1}^{N}\sigma_k^2.$$

ADU-ness of $t^*_B$ does not make any reference to a super population model, and the only requirement is to choose $W_k \propto (\pi_k^{-1} - 1)$. Of course, the form of the predictor $t^*_B$ pre-supposes a super population model.
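The optimal design and the resulting bound can be checked numerically. The following is a minimal sketch with hypothetical values of $\sigma_k$ (they are not taken from the text); it evaluates the Godambe-Joshi bound $\sum(\pi_k^{-1} - 1)\sigma_k^2$ at $\pi_k = n\sigma_k/\sum\sigma_k$ and at an equal-probability design with the same expected sample size.

```python
def optimal_inclusion_probs(sigma, n):
    """pi_k = n * sigma_k / sum(sigma), the design attaining the minimum."""
    total = sum(sigma)
    return [n * s / total for s in sigma]


def godambe_joshi_bound(sigma, pi):
    """sum_k (1/pi_k - 1) * sigma_k^2 for given inclusion probabilities."""
    return sum((1.0 / p - 1.0) * s * s for s, p in zip(sigma, pi))


def minimum_bound(sigma, n):
    """(sum sigma)^2 / n - sum sigma^2, the bound at the optimal design."""
    return sum(sigma) ** 2 / n - sum(s * s for s in sigma)


# Hypothetical model standard deviations for N = 5 units, sample size n = 2.
sigma = [1.0, 2.0, 3.0, 4.0, 5.0]
n = 2
pi_opt = optimal_inclusion_probs(sigma, n)
pi_equal = [n / len(sigma)] * len(sigma)

print(godambe_joshi_bound(sigma, pi_opt), minimum_bound(sigma, n))
print(godambe_joshi_bound(sigma, pi_equal))
```

The formula $\pi_k = n\sigma_k/\sum\sigma_k$ is only a valid set of inclusion probabilities when every value is at most one, which holds for the values chosen here.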

Note 1. Consider an estimator of the type $\hat Y^* = \hat\beta X$. Imitating Brewer's asymptotic analysis, Mukhopadhyay and Bhattacharya (1988) have shown that the optimum ADU estimator in this class is Hájek's estimator, with asymptotic expected variance attaining the Godambe-Joshi lower bound given the $\pi_k$ values.

Note 2. It may be further observed that Royall's (1970) estimator

$$\hat Y_{RL} = \sum_s Y_k + \hat\beta_{BLU}\sum_{\bar s}x_k,$$

where

$$\hat\beta_{BLU} = \left(\sum_s x_kY_k/\sigma_k^2\right) \Big/ \left(\sum_s x_k^2/\sigma_k^2\right),$$

and the estimator $\hat Y^*_{RL} = \hat\beta_{BLU}X$ are not ADU, although for a given sampling design both of them have asymptotic mean square error equal to the Godambe-Joshi lower bound.


Note 3: Royall's estimator, which is completely model dependent, may be seriously biased when the model is wrongly chosen, whereas Brewer's class of estimators (1979), although model based, takes the design into account so as to satisfy asymptotic design unbiasedness and asymptotic efficiency, thus providing greater protection against model mis-specification.

Note 4: Writing

$$Y = \sum_{k\in s}Y_k + \bar\beta\left(X - \sum_{k\in s}X_k\right),$$

where

$$\bar\beta = \left(Y - \sum_s Y_k\right) \Big/ \left(X - \sum_s X_k\right),$$

Rao (1984b) considers a predictor

$$\hat Y_c = \sum_s Y_k + \hat\beta_c\sum_{\bar s}X_k,$$

where $\hat\beta_c$, as suggested by Godambe (1955), is given by

$$\hat\beta_c = \sum_s(\pi_k^{-1} - 1)Y_k \Big/ \sum_s(\pi_k^{-1} - 1)X_k.$$

Following Basu (1971), $\hat\beta_c$ may be justified as some sort of weighted average of the observed ratios $Y_k/X_k$, the weights being proportional to $(\pi_k^{-1} - 1)X_k$. The optimum estimator belongs to Brewer's class of estimators, with asymptotic expected variance equal to the Godambe-Joshi lower bound.

Robinson and Tsui (1979) suggested a different asymptotic framework, considering a sequence of populations

$$P(j): \{Y_k, x_k, W_{jk},\ k = 1, \ldots, N_j\},\quad j = 1, 2, \ldots,\ \text{with } N_j \to \infty \text{ as } j \to \infty.$$

A sample $s_j$ of size $n_j$ is drawn from $P(j)$ using a similar sampling design for all $j$, having inclusion probabilities $\pi_k^{(j)}$, $\pi_{kk'}^{(j)}$. Thus we have a sequence of predictors

$$t^*_B(j) = \sum_{s_j}Y_k + \hat\beta^{(j)}\sum_{\bar s_j}x_k,$$

where

$$\hat\beta^{(j)} = \sum_{s_j}W_{jk}Y_k \Big/ \sum_{s_j}W_{jk}x_k.$$

Under assumptions: (i) Y k and x k are bounded uniformly in k and W, k in 

(j\k), (ii) — Z Wj k x k is bounded away from zero, uniformly in large/ for all Sj 

N j s, 

with lim —^-<oo , (iii) lim sup -n ^ -n { /} =0, (iv) for large y, 

i -* 00 k *k'<,N, 

n { k n -a j l (a j + W jk \ k = 1,. .., N h where 



i_ 


1 

1 — 

-1 

p 

fl 

^WjkXk 

1 


-M 

1 


Robinson and Tsui (1979) proved that

$$\lim_{j\to\infty} E_p\left[\frac{t^*_B(j)}{N_j} - \frac{Y(j)}{N_j}\right]^2 = 0,$$

where $Y(j)$ is the population total for the $j$th population, showing thereby that $t^*_B(j)/N_j$ is mean square consistent. Since $t^*_B(j)/N_j$ is bounded, it is asymptotically design consistent (ADC), that is,

$$P\left[\,\left|t^*_B(j)/N_j - Y(j)/N_j\right| > \delta\,\right] \to 0 \text{ as } j \to \infty,$$

and ADU, that is,

$$\lim_{j\to\infty} E_p\left[\frac{t^*_B(j)}{N_j} - \frac{Y(j)}{N_j}\right] = 0.$$

Robinson and Tsui (1979) derived the asymptotic value of the design expectation of the model mean square error of $t^*_B(j)/\sqrt{N_j}$ as $j \to \infty$, and this happens to be the same as the one obtained by Brewer; so also is the optimum sampling design.


ADU-ness Under the Set-up of Continuously Nested Populations 


Robinson and Särndal (1983) considered a generalized regression predictor, motivated by a super population model $\xi$, and established that the predictor is asymptotically design unbiased (ADU) and asymptotically design consistent (ADC) under conditions which do not require the super population model to be correct. Using similar conditions they obtained an asymptotic mean square error criterion and minimized it for given inclusion probabilities, under the assumption that the super population model is correct. This minimum value is further minimised by choosing the inclusion probabilities proportional to the model standard deviation. Their asymptotic framework rests on the following:

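a) There are $T$ nested populations $P^{(1)} \subset P^{(2)} \subset \ldots \subset P^{(T)}$; $P^{(j)}$ is of size $N_j$ with $N_1 < N_2 < \ldots < N_T$.

b) A sample $s_j$ of size $n_j$ is selected from $P^{(j)}$ following the sampling design with inclusion probabilities $\pi_k^{(j)}$ and $\pi_{kk'}^{(j)}$, where

$$\pi_k^{(j)} = \sum_{s\ni k}p(s) \quad \text{and} \quad \pi_{kk'}^{(j)} = \sum_{s\ni k,k'}p(s), \quad j = 1, 2, \ldots$$

c) Associated with the $k$th unit of $P^{(j)}$ is a random variable $Y_k$ as well as a set of $p + 1$ real numbers $x_{kr}$ $(r = 0, 1, \ldots, p)$. Define $x_k = (x_{k0}, \ldots, x_{kp})'$, the population $P^{(j)}$ mean for $Y_k$ as

$$\bar Y^{(j)} = \frac{1}{N_j}\sum_k Y_k$$

and the population mean for $x_{kr}$ as

$$\bar X_r^{(j)} = \frac{1}{N_j}\sum_k x_{kr}.$$

The general regression predictor (Särndal, 1980), motivated by a supposed super population regression model linking the dependent variable $Y_k$ to the independent variables $x_k$, is given by

$$t_{GR}^{(j)} = \frac{1}{N_j}\sum_{k\in s_j}\frac{Y_k}{\pi_k^{(j)}} - \sum_{r=0}^{p}\hat\beta_r^{(j)}\left[\frac{1}{N_j}\sum_{k\in s_j}\frac{x_{kr}}{\pi_k^{(j)}} - \bar X_r^{(j)}\right].$$

The $\hat\beta_r^{(j)}$ are functions of $I^{(j)}$ and $Y^{(j)}$ such that $t_{GR}^{(j)}$ is computable from sample information, where $I^{(j)} = (I_1^{(j)}, \ldots, I_{N_j}^{(j)})'$ is the vector of sample inclusion indicators ($I_k^{(j)} = 1$ if $k \in s_j$ and 0 otherwise) and $Y^{(j)} = (Y_1, \ldots, Y_{N_j})'$.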

The ADU and ADC properties are defined as follows:

Allow $P^{(j)}$ to expand by making $N_j \to \infty$ as $j \to \infty$, and so also $n_j \to \infty$, but an increase of $n_j$ as fast as $N_j$ is not required. The infinite dimensional random variable $(Y_1, Y_2, \ldots)$ follows a $\xi$-probability distribution with $E_m(Y_k) = \mu_k$, $V_m(Y_k) = \sigma_k^2$ and $\mathrm{cov}(Y_k, Y_{k'}) = 0$ for $k \ne k'$.


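Definition 1: A predictor is ADU if

$$\lim_{j\to\infty}\left[E\left(t_{GR}^{(j)} \mid Y^{(j)}\right) - \bar Y^{(j)}\right] = 0$$

with $\xi$-probability one.

Definition 2: A predictor is ADC if, for all $\epsilon > 0$,

$$\lim_{j\to\infty} P\left[\,\left|t_{GR}^{(j)} - \bar Y^{(j)}\right| > \epsilon \mid Y^{(j)}\right] = 0$$

with $\xi$-probability one.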


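Note: By Markov's inequality,

$$P\left[\,\left|t_{GR}^{(j)} - \bar Y^{(j)}\right| > \epsilon \mid Y^{(j)}\right] \le \epsilon^{-1}E\left[\,\left|t_{GR}^{(j)} - \bar Y^{(j)}\right| \mid Y^{(j)}\right].$$

Taking the limit of both sides, convergence of the right-hand side to zero, which is how the ADU-ness of $t_{GR}^{(j)}$ is established below, implies that $t_{GR}^{(j)}$ is also ADC.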

Theorem 8.2: The generalized regression estimator $t_{GR}^{(j)}$ is ADU and ADC under the assumptions

1. $\lim_{j\to\infty} N_j^{-1}\sum_k x_{kr}^2 < \infty$ for $r = 0, 1, \ldots, p$;

2. $\lim_{j\to\infty} N_j^{-1}\sum_k Y_k^2 < \infty$ with $\xi$-probability one;

3. $\lim_{j\to\infty} E\left[\sum_{r=0}^{p}\hat\beta_r^{(j)2} \mid Y^{(j)}\right] < \infty$ with $\xi$-probability one;

4. $\lim_{j\to\infty} N_j\min_{1\le k\le N_j}\pi_k^{(j)} = \infty$;

5. $\lim_{j\to\infty}\max_{1\le k\ne k'\le N_j}\left|\dfrac{\pi_{kk'}^{(j)}}{\pi_k^{(j)}\pi_{k'}^{(j)}} - 1\right| = 0$,

where $\pi_k^{(j)} = P(I_k^{(j)} = 1)$ and $\pi_{kk'}^{(j)} = P(I_k^{(j)} = I_{k'}^{(j)} = 1)$.
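Proof. Write

$$t_{GR}^{(j)} - \bar Y^{(j)} = A^{(j)} - \sum_{r=0}^{p}\hat\beta_r^{(j)}B_r^{(j)},$$

where

$$A^{(j)} = \frac{1}{N_j}\sum_k Y_k\left(\frac{I_k^{(j)}}{\pi_k^{(j)}} - 1\right) \quad \text{and} \quad B_r^{(j)} = \frac{1}{N_j}\sum_k x_{kr}\left(\frac{I_k^{(j)}}{\pi_k^{(j)}} - 1\right), \quad 0 \le r \le p.$$

Thus,

$$E\left[\,\left|t_{GR}^{(j)} - \bar Y^{(j)}\right| \mid Y^{(j)}\right] \le \left[E\left(A^{(j)2} \mid Y^{(j)}\right)\right]^{1/2} + \left[E\left(\sum_{r=0}^{p}\hat\beta_r^{(j)2} \,\Big|\, Y^{(j)}\right)\right]^{1/2}\left[\sum_{r=0}^{p}E\left(B_r^{(j)2}\right)\right]^{1/2}.$$

Now,

$$E\left(A^{(j)2} \mid Y^{(j)}\right) = \frac{1}{N_j^2}\sum_k Y_k^2\left(\frac{1}{\pi_k^{(j)}} - 1\right) + \frac{1}{N_j^2}\sum_{k\ne k'}Y_kY_{k'}\left(\frac{\pi_{kk'}^{(j)}}{\pi_k^{(j)}\pi_{k'}^{(j)}} - 1\right) = F_1 + F_2.$$

$F_1$ is dominated by

$$\left(N_j\min_{1\le k\le N_j}\pi_k^{(j)}\right)^{-1}N_j^{-1}\sum_k Y_k^2 \to 0$$

as $j \to \infty$ with $\xi$-probability one, under assumptions (2) and (4). $F_2$ is dominated by

$$\max_{1\le k\ne k'\le N_j}\left|\frac{\pi_{kk'}^{(j)}}{\pi_k^{(j)}\pi_{k'}^{(j)}} - 1\right|\left(N_j^{-1}\sum_k|Y_k|\right)^2 \le \max_{1\le k\ne k'\le N_j}\left|\frac{\pi_{kk'}^{(j)}}{\pi_k^{(j)}\pi_{k'}^{(j)}} - 1\right|N_j^{-1}\sum_k Y_k^2 \to 0$$

with $\xi$-probability one, under (2) and (5). Thus, $E(A^{(j)2} \mid Y^{(j)}) \to 0$ with $\xi$-probability one as $j \to \infty$. Further, arguing similarly, it can be shown that $E(B_r^{(j)2}) \to 0$ as $j \to \infty$, $0 \le r \le p$, using (1), (4) and (5). Thereafter, considering (3), the proof follows, i.e.

$$\lim_{j\to\infty}E\left[\,\left|t_{GR}^{(j)} - \bar Y^{(j)}\right| \mid Y^{(j)}\right] = 0,$$

showing thereby the ADU-ness of $t_{GR}^{(j)}$ and, by Markov's inequality, that it is ADC.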

Robinson and Särndal (1983) have also derived the optimal choice of $\pi_k^{(j)}$ by minimizing the unconditional MSE of $t_{GR}^{(j)}$ under a super population model along with a set of assumptions, and this happens to be $\pi_k^{(j)} = n_j\sigma_k/\sum_k\sigma_k$. The minimum value of the MSE is the minimum expected variance of any design unbiased estimator.

For other asymptotic results the reader may refer to Isaki and Fuller (1982). Using the super population approach to a finite population, Rodrigues, Bolfarine and Rogatko (1985) have developed a general theory of prediction for linear and quadratic functions of the population values.

Mukhopadhyay (1990), considering a multiple regression super population model, has suggested a generalized predictor of the finite population variance under unequal probability sampling. The predictor is found to be ADU and ADC under conditions which do not require the assumed super population model to hold.


NUMERICAL EXAMPLES 

Example 8.1: For the artificial population in Table 8E.1 (Cochran, 1977), compare the efficiency of (i) the simple mean per unit estimator, (ii) the ratio estimator based on the ratio of means, (iii) the ratio estimator based on the mean of ratios, and (iv) the unbiased Hartley-Ross ratio type estimator, to estimate the population mean with a SRSWOR of size 2.

Table 8E.1. Artificial population

Units     Y       x
1         0.3     1
2         0.5     1
3         0.8     2
4         0.9     3
5         1.5     3

Solution

i) Simple mean per unit estimator:

$t_1 = \bar y$ (sample mean),

$MSE(\bar y) = [(N - n)/Nn]\,S_y^2 = 0.063$.

ii) Ratio estimator based on ratio of means:

$t_2 = (\bar y/\bar x)\,\bar X$,

$MSE(t_2) = E(t_2 - \bar Y)^2 = 0.0138$.

iii) Ratio estimator based on mean of ratios:

$t_3 = \bar r\,\bar X$, where $\bar r = \dfrac{1}{n}\sum_s\dfrac{y_i}{x_i}$,

$MSE(t_3) = E(t_3 - \bar Y)^2 = 0.012$.

iv) Unbiased Hartley-Ross ratio type estimator:

$t_4 = \bar r\,\bar X + \dfrac{n(N - 1)}{N(n - 1)}(\bar y - \bar r\,\bar x)$,

$MSE(t_4) = E(t_4 - \bar Y)^2 = 0.017$.

Comparison of Efficiency

Estimator          t_1     t_2     t_3     t_4
Efficiency (%)     100     456     525     370
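Because the population has only five units, the mean square errors in Example 8.1 can be checked by listing all ten SRSWOR samples of size 2. The sketch below does this; the function names are illustrative, and the printed figures should come out close to the tabulated ones (the table appears to truncate rather than round the last estimator's MSE).

```python
from itertools import combinations

# Table 8E.1 (artificial population).
Y = [0.3, 0.5, 0.8, 0.9, 1.5]
X = [1, 1, 2, 3, 3]
N, n = len(Y), 2
Y_BAR = sum(Y) / N
X_BAR = sum(X) / N


def estimates(sample):
    """Return (t1, t2, t3, t4) for one sample of unit indices."""
    ys = [Y[i] for i in sample]
    xs = [X[i] for i in sample]
    y_bar = sum(ys) / n
    x_bar = sum(xs) / n
    r_bar = sum(yi / xi for yi, xi in zip(ys, xs)) / n
    t1 = y_bar                                  # mean per unit
    t2 = (y_bar / x_bar) * X_BAR                # ratio of means
    t3 = r_bar * X_BAR                          # mean of ratios
    t4 = t3 + n * (N - 1) / (N * (n - 1)) * (y_bar - r_bar * x_bar)  # Hartley-Ross
    return t1, t2, t3, t4


def exact_mse():
    """Average squared error over all equally likely SRSWOR samples."""
    samples = list(combinations(range(N), n))
    totals = [0.0, 0.0, 0.0, 0.0]
    for s in samples:
        for j, t in enumerate(estimates(s)):
            totals[j] += (t - Y_BAR) ** 2
    return [v / len(samples) for v in totals]


mse = exact_mse()
efficiency = [100 * mse[0] / m for m in mse]
for name, m, e in zip(("t1", "t2", "t3", "t4"), mse, efficiency):
    print(f"{name}: MSE = {m:.4f}, efficiency = {e:.0f}%")
```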


Example 8.2: For a hypothetical small population of size 5, the y and x values are given in Table 8E.2:

Table 8E.2

U_i :   1   2   3   4   5
Y_i :   3   5   8   4   5
X_i :   5   2   6   4   3

Compare the approximate (first order) and actual biases and mean square errors of the ratio estimator of the population mean $\bar Y$ for samples of size n = 2, n = 3 and n = 4.
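The quantities asked for in this example can be computed directly. The sketch below evaluates the usual first-order approximations, $(1/n - 1/N)\,\bar Y\,[C_x^2 - \rho C_yC_x]$ for the bias and $(1/n - 1/N)\,\bar Y^2[C_y^2 + C_x^2 - 2\rho C_yC_x]$ for the MSE, and obtains the actual values by enumerating every SRSWOR sample. The function names are illustrative only.

```python
from itertools import combinations

# Table 8E.2 (hypothetical population of size 5).
Y = [3, 5, 8, 4, 5]
X = [5, 2, 6, 4, 3]
N = len(Y)
Y_BAR = sum(Y) / N
X_BAR = sum(X) / N


def _cov(u, v, u_bar, v_bar):
    """Population covariance with divisor N - 1."""
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / (N - 1)


C_YY = _cov(Y, Y, Y_BAR, Y_BAR) / Y_BAR ** 2        # C_y^2
C_XX = _cov(X, X, X_BAR, X_BAR) / X_BAR ** 2        # C_x^2
C_XY = _cov(X, Y, X_BAR, Y_BAR) / (X_BAR * Y_BAR)   # rho * C_y * C_x


def approx_bias(n):
    return (1 / n - 1 / N) * Y_BAR * (C_XX - C_XY)


def approx_mse(n):
    return (1 / n - 1 / N) * Y_BAR ** 2 * (C_YY + C_XX - 2 * C_XY)


def exact_bias_mse(n):
    """Bias and MSE of (ybar / xbar) * Xbar over all SRSWOR samples of size n."""
    ests = [X_BAR * sum(Y[i] for i in s) / sum(X[i] for i in s)
            for s in combinations(range(N), n)]
    bias = sum(ests) / len(ests) - Y_BAR
    mse = sum((t - Y_BAR) ** 2 for t in ests) / len(ests)
    return bias, mse


for n in (2, 3, 4):
    bias, mse = exact_bias_mse(n)
    print(n, round(approx_bias(n), 4), round(bias, 4),
          round(approx_mse(n), 4), round(mse, 4))
```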

Solution

$\bar Y = 5$, $\bar X = 4$, $N = 5$

$S_y^2 = 3.5$, $S_x^2 = 2.5$, $\rho = 0.3380$

$C_y = 0.37416$, $C_x = 0.39527$

Approximate bias $= \left(\dfrac{1}{n} - \dfrac{1}{N}\right)\bar Y\,[C_x^2 - \rho C_yC_x]$

Actual bias $= E\left[\left(\dfrac{\bar y}{\bar x}\right)\bar X\right] - \bar Y$

Approximate MSE $= \left(\dfrac{1}{n} - \dfrac{1}{N}\right)\bar Y^2\,[C_y^2 + C_x^2 - 2\rho C_yC_x]$

Actual MSE $= E\left[\left(\dfrac{\bar y}{\bar x}\right)\bar X - \bar Y\right]^2$

Comparison of actual and approximate bias:

Bias                                 n = 2
Actual                               0.4160
Approximate                          0.1594
Percent error in approximation       -61.68

Comparison of actual and approximate MSE:

M.S.E.                               n = 2      n = 3      n = 4
Actual                               1.8550     0.7018     0.2499
Approximate                          1.4720     0.6540     0.2453
Percent error in approximation       -20.65     -6.81      -1.84

The computations corroborate the expected result that the error in the approximations to the bias and the MSE decreases as n increases towards N. For small sample sizes the approximation underestimates the MSE.

Example 8.3: A pilot scheme for studying milk yield, breeds, feeding and management practices of cattle and buffaloes was conducted in eastern Uttar Pradesh during 1957-58. Table 8E.3 below presents the total number of milch cows in the 19 selected villages of the dry region as enumerated in rainy season 1957 and as given by the Census, 1956.

Table 8E.3

Sl. No. of      Number of milch cows
village         Rainy season 1957     Census 1956
1               35                    47
2               38                    46
3               71                    253
4               4                     19
5               63                    121
6               4                     4
7               14                    5
8               7                     7
9               66                    50
10              44                    162
11              8                     9
12              229                   256
13              27                    74
14              30                    28
15              29                    41
16              29                    27
17              97                    25
18              30                    40
19              40                    66


Estimate the total number of milch cows, with its standard error, for the rural area of the dry region in rainy season 1957 by the ratio and regression methods of estimation. Calculate their relative efficiencies compared to the mean per unit estimate, which does not use the 1956 census data. It is given that the total number of milch cows in the rural area of the region according to the 1956 census is 576,082 and the total number of villages is 9179.

Solution: Let $y_i$ denote the number of milch cows in rainy season 1957 and $x_i$ the number of milch cows according to the census, 1956, for the $i$th selected village.

X = number of milch cows according to census 1956 = 576,082
N = number of villages in the region = 9179
n = number of villages in the sample = 19

$\sum y_i = 865$, $\sum x_i = 1280$, $\hat R = \sum y_i/\sum x_i = 0.6758$

$\sum y_ix_i = 109{,}389$, $s_y^2 = 2588.48$

$s_x^2 = 5933.69$, $s_{xy} = 2839.74$

$b_{yx} = 0.4786$, $\rho^2 = 0.5250$.

i) Estimate of total based on mean per unit:

$\hat Y = (N/n)\sum y_i = (9179/19)(865) = 417{,}886$

Est. $V(\hat Y) = \dfrac{N(N - n)}{n}\,s_y^2 = 1145 \times 10^7$

Est. S.E. $(\hat Y) = 107 \times 10^3$.

ii) Ratio method of estimation:

$\hat Y_R = \left(\dfrac{\sum y_i}{\sum x_i}\right)X = \left(\dfrac{865}{1280}\right)(576{,}082) = 389{,}305$

Est. $V(\hat Y_R) = \dfrac{N(N - n)}{n}\,[s_y^2 + \hat R^2s_x^2 - 2\hat R\,s_{xy}] = 646 \times 10^7$

Est. S.E. $(\hat Y_R) = 80.3 \times 10^3$.

iii) Regression method of estimation:

$\hat Y_{reg} = N[\bar y + b_{yx}(\bar X - \bar x)] = 9179\,[45.5263 + (0.4786)(62.7609 - 67.3684)] = 397{,}645$

Est. $V(\hat Y_{reg}) = \dfrac{N(N - n)}{n}\,s_y^2(1 - \rho^2) = 544 \times 10^7$

Est. S.E. $(\hat Y_{reg}) = 73.7 \times 10^3$.

Gain in efficiency of the ratio method of estimation over mean per unit:

$\dfrac{V(\hat Y) - V(\hat Y_R)}{V(\hat Y_R)} \times 100 = \dfrac{(1145 - 646)}{646} \times 100 = 77.24\%$

Gain in efficiency of the regression method of estimation over mean per unit:

$\dfrac{V(\hat Y) - V(\hat Y_{reg})}{V(\hat Y_{reg})} \times 100 = \dfrac{(1145 - 544)}{544} \times 100 = 110.47\%$
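The arithmetic of Example 8.3 can be reproduced from the raw village counts in Table 8E.3. The sketch below is one way to do it; the variable names are illustrative, and the variance of the ratio estimate uses the usual form $s_y^2 + \hat R^2s_x^2 - 2\hat R\,s_{xy}$.

```python
# Table 8E.3: milch cows in the 19 sampled villages.
y = [35, 38, 71, 4, 63, 4, 14, 7, 66, 44, 8, 229, 27, 30, 29, 29, 97, 30, 40]     # rainy season 1957
x = [47, 46, 253, 19, 121, 4, 5, 7, 50, 162, 9, 256, 74, 28, 41, 27, 25, 40, 66]  # census 1956
N = 9179           # villages in the region
X_TOTAL = 576082   # census 1956 total for the region
n = len(y)


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Sample covariance with divisor n - 1."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / (len(u) - 1)


s_yy, s_xx, s_xy = cov(y, y), cov(x, x), cov(x, y)
R = sum(y) / sum(x)                 # sample ratio
b = s_xy / s_xx                     # sample regression coefficient
rho2 = s_xy ** 2 / (s_xx * s_yy)    # squared sample correlation
scale = N * (N - n) / n

total_mean = N * mean(y)
total_ratio = R * X_TOTAL
total_reg = N * (mean(y) + b * (X_TOTAL / N - mean(x)))

var_mean = scale * s_yy
var_ratio = scale * (s_yy + R ** 2 * s_xx - 2 * R * s_xy)
var_reg = scale * s_yy * (1 - rho2)

for name, t, v in (("mean per unit", total_mean, var_mean),
                   ("ratio", total_ratio, var_ratio),
                   ("regression", total_reg, var_reg)):
    gain = 100 * (var_mean - v) / v
    print(f"{name}: total = {t:.0f}, s.e. = {v ** 0.5:.0f}, gain = {gain:.1f}%")
```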

Example 8.4: The data in Table 8E.4 were obtained in a pilot survey to study the extent of cultivation and production of fresh fruits in a certain region of Bihar (India) in 1967-68.

Estimate the total number of trees in the region by both the separate and the combined ratio methods of estimation, and compute the relative efficiency of the separate ratio method of estimation compared to the combined one.

Table 8E.4

Stratum 1: total number of villages $N_1 = 985$; total area under orchards $X_1 = 11253$ acres
  Area under orchards (in acres) ($x_i$): 10.63, 9.90, 1.45, 3.38, 5.17, 10.35
  Total number of trees ($y_i$):          747, 719, 78, 201, 311, 448

Stratum 2: total number of villages $N_2 = 2196$; total area under orchards $X_2 = 25115$ acres
  Area under orchards (in acres) ($x_i$): 14.66, 2.61, 4.35, 9.87, 2.42, 5.60, 4.70, 36.35
  Total number of trees ($y_i$):          580, 103, 316, 739, 196, 235, 212, 1646

Stratum 3: total number of villages $N_3 = 1020$; total area under orchards $X_3 = 18870$ acres
  Area under orchards (in acres) ($x_i$): 11.60, 5.29, 7.94, 7.29, 8.00, 1.20, 11.50, 7.96, 23.15, 1.70, 2.01
  Total number of trees ($y_i$):          488, 227, 374, 491, 499, 50, 455, 47, 879, 115, 115

Solution: Compute

$\hat R = \dfrac{\sum_i(N_i/N)\,\bar y_i}{\sum_i(N_i/N)\,\bar x_i} = \dfrac{443.53}{8.80} = 50.40$

$X = \sum_{i=1}^{3}X_i = 55238$.

Combined ratio estimate:

$\hat Y_{CR} = \hat R\,X = (50.40)(55238) = 2{,}783{,}995$

Est. $V(\hat Y_{CR}) = \sum_i N_i^2\left(\dfrac{1}{n_i} - \dfrac{1}{N_i}\right)(s_{iy}^2 + \hat R^2s_{ix}^2 - 2\hat R\,s_{ixy}) = 6{,}019{,}519{,}627$

Separate ratio estimate:

$\hat R_1 = 61.28$, $\hat R_2 = 49.99$, $\hat R_3 = 42.66$

$\hat Y_{SR} = \sum_{i=1}^{3}\hat R_iX_i = 2{,}750{,}077$

Est. $V(\hat Y_{SR}) = \sum_i N_i^2\left(\dfrac{1}{n_i} - \dfrac{1}{N_i}\right)(s_{iy}^2 + \hat R_i^2s_{ix}^2 - 2\hat R_i\,s_{ixy}) = 2{,}441{,}137{,}835$

The efficiency of the separate ratio estimate $\hat Y_{SR}$ over the combined ratio estimate $\hat Y_{CR}$ is given by

[Est. $V(\hat Y_{CR})$ / Est. $V(\hat Y_{SR})$] $\times$ 100 = 246.58%

Example 8.5: A survey was conducted by the Indian Council of Agricultural Research in Varanasi district of Uttar Pradesh (India) with the object of studying the cultivation practices and yield of the mango crop during 1960. Table 8E.5 presents the estimated yield of mango and the number of bearing trees of the 19 selected villages in Gyanpur Tehsil of Varanasi district, U.P. The villages were selected with replacement with probability ($p_i$) proportional to the reported area under orchards. The total number of bearing trees in the Tehsil is counted to be 20551.

Estimate the total yield of mango in Gyanpur Tehsil using the ratio method of estimation, along with its standard error.

Table 8E.5. Yield of mango and number of bearing trees in 19 selected villages.

Serial No.     Total yield of     Total No. of       1/p_i
of village     mango (kg)         bearing trees
1              1784               82                 147.6
2              1383               86                 266.4
3              804                30                 1598.7
4              878                29                 245.9
5              8504               75                 195.8
6              1506               42                 1199.0
7              1280               39                 504.8
8              1563               45                 282.0
9              2125               34                 456.8
10             693                28                 417.0
11             459                23                 479.6
12             473                28                 383.7
13             847                23                 599.5
14             2378               42                 399.7
15             1751               32                 1598.7
16             990                54                 639.7
17             1380               50                 396.7
18             418                43                 47.5
19             360                33                 479.6


Solution: Let $y_i$ and $x_i$ be the yield of mango and the number of bearing trees respectively in the $i$th village, and let $A_i$ (in acres) be the area under orchards in the $i$th village of the tehsil.

$X = \sum_{i=1}^{N}x_i = 20551$ = total number of bearing trees in the tehsil,
$p_i$ = the probability of selecting the $i$th village with replacement,
N = total number of villages in the tehsil,
n = total number of villages in the sample.

Define $z_i = y_i/Np_i = t_i/N$, where $t_i = y_i/p_i$,
and $v_i = x_i/Np_i = t'_i/N$, where $t'_i = x_i/p_i$.

The estimate of the total yield of mango is given by

$\hat Y_R = \left(\sum z_i/\sum v_i\right)X = 35.8096 \times 20{,}551 = 735{,}923$ kg

Est. $V(\hat Y_R) = \dfrac{N^2}{n(n - 1)}\sum(z_i - \hat R_nv_i)^2$,

where $\hat R_n = \sum t_i/\sum t'_i = 35.8096$.

On substitution,

Est. $V(\hat Y_R) = \dfrac{1}{n(n - 1)}\sum(t_i - \hat R_nt'_i)^2 = \dfrac{1}{19 \times 18}\,[359{,}140{,}210.229] \times 10^4 = 1050117.5 \times 10^4$ (kg)$^2$.

Estimated standard error of $\hat Y_R$ = $1024.7 \times 10^2$ kg.
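The computation in Example 8.5 can be retraced from Table 8E.5 as follows. This is a sketch under the formulas stated in the solution; the variable names are illustrative, and small differences from the printed figures are to be expected because the tabulated values of $1/p_i$ are rounded.

```python
# Table 8E.5: 19 villages selected with replacement, probability proportional to area.
yield_kg = [1784, 1383, 804, 878, 8504, 1506, 1280, 1563, 2125, 693,
            459, 473, 847, 2378, 1751, 990, 1380, 418, 360]
trees = [82, 86, 30, 29, 75, 42, 39, 45, 34, 28,
         23, 28, 23, 42, 32, 54, 50, 43, 33]
inv_p = [147.6, 266.4, 1598.7, 245.9, 195.8, 1199.0, 504.8, 282.0, 456.8, 417.0,
         479.6, 383.7, 599.5, 399.7, 1598.7, 639.7, 396.7, 47.5, 479.6]
X_TOTAL = 20551   # total number of bearing trees in the tehsil
n = len(yield_kg)

t = [y * w for y, w in zip(yield_kg, inv_p)]      # t_i  = y_i / p_i
t_prime = [x * w for x, w in zip(trees, inv_p)]   # t'_i = x_i / p_i

R_n = sum(t) / sum(t_prime)          # estimated yield per bearing tree
total_yield = R_n * X_TOTAL          # ratio estimate of total yield (kg)

# Variance estimate in the form used in the solution above.
var_hat = sum((a - R_n * b) ** 2 for a, b in zip(t, t_prime)) / (n * (n - 1))
std_err = var_hat ** 0.5

print(f"R_n = {R_n:.4f}")
print(f"total yield = {total_yield:.0f} kg, s.e. = {std_err:.0f} kg")
```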

Example 8.6: In the example given by Shukla (1965) on a trivariate population of 50 jute plants, the characteristics measured are y = fibre weight in gm, $x_1$ = plant height in feet and $x_2$ = base diameter in cm. The population parameters are

$\bar Y = 5.69$ gm, $\bar X_1 = 6.65$ feet, $\bar X_2 = 1.42$ cm

$S_y^2 = 1.8764$, $S_{x_1}^2 = 0.3819$, $S_{x_2}^2 = 0.0264$

$\rho_{yx_1} = 0.7418$, $\rho_{yx_2} = 0.5678$, $\rho_{x_1x_2} = 0.2063$

Compute the efficiency of the following estimators with respect to the simple mean per unit ($\bar y$) estimator to estimate the population mean with a sample of size n.

(a) Ratio estimator with $x_1$ only; (b) Ratio estimator with $x_2$ only; (c) Regression estimator with $x_1$ only; (d) Regression estimator with $x_2$ only; (e) Olkin's weighted multivariate ratio estimator; (f) Multivariate regression estimate.

Solution: Variance of the simple mean per unit ($\bar y$) estimate:
$V(\bar y) = (1 - f)(\bar Y^2/n)(0.0580)$

Variance of the ratio estimator with $x_1$ only:
$V(\bar y_{r1}) = (1 - f)(\bar Y^2/n)(0.0334)$

Variance of the ratio estimator with $x_2$ only:
$V(\bar y_{r2}) = (1 - f)(\bar Y^2/n)(0.0399)$

Variance of the regression estimator with $x_1$ only:
$V(\bar y_{lr1}) = (1 - f)(\bar Y^2/n)(0.0261)$

Variance of the regression estimator with $x_2$ only:
$V(\bar y_{lr2}) = (1 - f)(\bar Y^2/n)(0.0393)$

Variance of Olkin's weighted multivariate ratio estimate:
$V(\bar y_{wmr}) = (1 - f)(\bar Y^2/n)(0.0308)$

Variance of the multivariate regression estimate:
$V(\bar y_{mreg}) = (1 - f)(\bar Y^2/n)(0.0153)$

Comparison of efficiency:

Estimator          ybar     ybar_r1    ybar_r2    ybar_lr1    ybar_lr2    ybar_wmr    ybar_mreg
Efficiency (%)     100      173.65     145.36     222.22      147.58      188.31      379.08
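The bracketed factors in Example 8.6 are relative variances and can be recomputed from the population parameters. The sketch below covers the single-variable ratio and regression estimators and the two-variable regression estimator, using the standard first-order expressions $C_y^2 + C_x^2 - 2\rho C_yC_x$, $C_y^2(1 - \rho^2)$ and $C_y^2(1 - R^2)$. Olkin's weighted estimator is left out because it needs the optimum weights, and the recomputed figures may differ from the printed ones in the last digit.

```python
from math import sqrt

# Population parameters from Shukla (1965), as quoted in Example 8.6.
Y_BAR, X1_BAR, X2_BAR = 5.69, 6.65, 1.42
S2_Y, S2_X1, S2_X2 = 1.8764, 0.3819, 0.0264
RHO_Y1, RHO_Y2, RHO_12 = 0.7418, 0.5678, 0.2063

C_Y = sqrt(S2_Y) / Y_BAR      # coefficient of variation of y
C_X1 = sqrt(S2_X1) / X1_BAR
C_X2 = sqrt(S2_X2) / X2_BAR


def ratio_factor(c_x, rho):
    """Relative variance of a ratio estimator: C_y^2 + C_x^2 - 2 rho C_y C_x."""
    return C_Y ** 2 + c_x ** 2 - 2 * rho * C_Y * c_x


def regression_factor(rho):
    """Relative variance of a linear regression estimator: C_y^2 (1 - rho^2)."""
    return C_Y ** 2 * (1 - rho ** 2)


def multiple_regression_factor():
    """C_y^2 (1 - R^2), with R^2 the squared multiple correlation of y on (x1, x2)."""
    r2 = (RHO_Y1 ** 2 + RHO_Y2 ** 2 - 2 * RHO_Y1 * RHO_Y2 * RHO_12) / (1 - RHO_12 ** 2)
    return C_Y ** 2 * (1 - r2)


factors = {
    "mean per unit": C_Y ** 2,
    "ratio, x1": ratio_factor(C_X1, RHO_Y1),
    "ratio, x2": ratio_factor(C_X2, RHO_Y2),
    "regression, x1": regression_factor(RHO_Y1),
    "regression, x2": regression_factor(RHO_Y2),
    "multivariate regression": multiple_regression_factor(),
}
for name, v in factors.items():
    print(f"{name}: factor = {v:.4f}, efficiency = {100 * factors['mean per unit'] / v:.1f}%")
```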

EXERCISES 


8.1 Given the data in Table 8.1 (Sukhatme, 1954), compute the relative efficiencies of the simple ratio estimator compared to the linear regression estimator and the Hartley-Ross unbiased ratio estimator for a sample of size 3.

Table 8.1 Mean agricultural area and mean number of livestock

Unit (agricultural     x-values (mean          y-values (mean no.
area unit)             agricultural area)      of livestock)
1                      63.7                    25.4
2                      155.3                   50.1
3                      245.7                   76.0
4                      344.4                   99.2
5                      767.52                  44.4
7                      1604.0                  425.1

8.2 Given a PPS sample with replacement, estimate the variance of the 
population total under simple random sampling with ratio method of 
estimation, taking x as the auxiliary variable. 

8.3 For a SRSWOR design where the paired values $(y_i, x_i)$ are observed on each unit in the sample of size n selected from a population of size N, derive the biases and mean square errors of the following estimators:

tMN- 6 rX+ (1 -$)rX, where 6 = 

{N(n-l)} 

tHii = drX+(\ - 8) r X, where 0 = 0} 

{N(n-\f X ' X) 


Ip 6 r X + (1 - 0) rX, where© =- ^ ^ (x,'X\ 

{/V(n-l)}' 

where $r = \bar y/\bar x$; $\bar r = (1/n)\sum(y_i/x_i)$; $\bar y$ and $\bar x$ are the sample means of y and x; $\bar X$ is the population mean of x (T. J. Rao, 1981).

Also, compare the efficiency of $t_{MN}$, $t_{HR}$ and $t_p$ for a SRSWOR of size 2 from a small population of 4 units.

Unit         u_1     u_2     u_3     u_4
x-values     1       2       3       4
y-values     0.5     1.2     2.1     3.2


8.4 Let $y_i$, $x_i$ $(i = 1, \ldots, N)$ be the values of positively correlated variates y and x defined on a finite population of size N, with $x_i > 0$ for every $i = 1, \ldots, N$. Denote by $p_m$ the sampling design under the Midzuno-Sen sampling scheme, by $p_e$ the sampling design corresponding to SRSWOR and by $p_h$ the sampling design where the SRSWOR sample is split at random into two independent half samples, using half-sample means $\bar y_1$, $\bar y_2$ and $\bar x_1$, $\bar x_2$. Define the following estimators:

$t_1 = \bar X(\bar y/\bar x)$, where $\bar X$ is the population mean of x, and $\bar y$ and $\bar x$ are the sample means of y and x respectively.

$t_2 = \bar y + b(\bar X - \bar x)$, where b is the sample regression coefficient.

$t_3 = r^*\bar X$, where $r^* = 2(\bar y/\bar x) - \tfrac{1}{2}(\bar y_1/\bar x_1 + \bar y_2/\bar x_2)$.


t4 = (X-x) r 

t s =X 


, ( y~rx ) 


\x. 


(«-l) 




= Xr 


(. y-rx) 
(n~ 1 ) 


(n~ 1 ) 
(n-l) 


Compare the biases and efficiencies of the following strategies:
(i) $(p_m, t_1)$ (ii) $(p_e, t_1)$ (iii) $(p_e, t_2)$ (iv) $(p_e, t_5)$ (v) $(p_h, t_3)$
(vi) $(p_h, t_4)$ (vii) $(p_h, t_6)$ (viii) $(p_h, t_7)$ (ix) $(p_h, t_8)$
under the model where the values $y_i$ and $x_i$ $(i = 1, \ldots, N)$, fixed for a given finite population, are supposed to be realizations respectively of random variates $Y_i$ and $X_i$ jointly distributed with $Y_i = \alpha + \beta X_i + e_i$, where $\alpha$ and $\beta$ are constants and $e_i$ $(i = 1, \ldots, N)$ are random errors distributed with

$E(e_i \mid X_i) = 0$, $E(e_i^2 \mid X_i) = \delta X_i^g$,
$E(e_ie_j \mid X_i, X_j) = 0$ for every $i \ne j$.

Here $0 < \delta < \infty$, $0 \le g \le 2$ and the $X_i$ are independently distributed with a common gamma density
$G(\theta) = [1/\Gamma(\theta)]\,e^{-x}x^{\theta - 1}$ in which $x > 0$, $\theta > 2$.

8.5 Derive the approximate expected value and variance of the following estimators to $O(1/n)$:

i) $t_1 = s_{xy}/s_x^2$ ii) $t_2 = s_{xy}/(s_xs_y)$

where $s_{xy}$, $s_y$ and $s_x$ are the SRSWOR sample covariance between y and x, the standard deviation of y and the standard deviation of x respectively. Construct an almost unbiased estimator (bias of order $1/n^2$) in each case. Study their behaviour when (y, x) follow a bivariate normal distribution.

8.6 Derive the bias and mean square errors of the following almost unbiased ratio estimators and compare their efficiencies under (i) SRSWOR, (ii) bivariate normality of (y, x), and (iii) the super population model assumed in Exercise 8.4.

i) $r_b = r\,(1 + \theta_1s_{xy}/\bar x\bar y)/(1 + \theta_1s_x^2/\bar x^2)$,

ii) $r_t = r\,[1 + \theta_1(s_{xy}/\bar x\bar y - s_x^2/\bar x^2)]$,

iii) $r_s = r/\{1 + \theta_1(s_x^2/\bar x^2 - s_{xy}/\bar x\bar y)\}$,
where $\theta_1 = (1/n - 1/N)$.

8.7 For a simple random sample of paired random variables $(X_i, Y_i)$, $(i = 1, \ldots, n)$ with $E(X_i) = \mu$ and $E(Y_i) = \eta$, let $\hat\theta = \hat\eta/\hat\mu$ be the ratio estimator of $\theta = \eta/\mu$, where $\hat\eta$ and $\hat\mu$ are simple mean estimators. Apply the jackknife to the ratio estimator with number of groups $g = 2$ in the model $Y_i = \alpha + \beta x_i + e_i$, where the $e_i$ are independently and identically distributed with either normal or gamma distribution.

Establish the result, neglecting terms of $O(n^{-4})$, that the jackknife estimator has both smaller bias and smaller variance than the simple estimator $\hat\eta/\hat\mu$ for the normal distribution, and that in the case of the gamma distribution with coefficient of variation less than 1/4 the jackknife reduces the bias and increases the variance, but reduces the mean square error (Durbin, 1959).

8.8 To estimate the population ratio R, defined as the ratio of the population mean of y to that of x, let us form the estimator

$r' = (\bar y/\bar x)(1 + \theta_1C_x^2)/(1 + \theta_1C_y^2)$,

where $\bar y$ and $\bar x$ are the sample means of y and x respectively.

Compare $r'$ with $r$ $(= \bar y/\bar x)$ in terms of biases and mean square errors (Swain, 1981).

8.9 In stratified random sampling consider the situation where a characteristic x is closely related to the study variable y and $Y = (Y_{11}, Y_{12}, \ldots, Y_{kN_k})$ is a realization of an N-length random vector with distribution depending on $X = (X_{11}, X_{12}, \ldots, X_{kN_k})$ and some unknown parameters:

$E_\xi(Y_{ij} \mid X_{ij}) = a + bX_{ij}$
$V_\xi(Y_{ij} \mid X_{ij}) = \sigma^2X_{ij}^g$
$\mathrm{cov}_\xi(Y_{ij}, Y_{i'j'} \mid X_{ij}, X_{i'j'}) = 0$

where a, b and g are constants; $1 \le g \le 2$. Show that

i) For any stratification of the population with simple random sampling without replacement in each stratum and with proportional allocation of sample size, it is necessary that the x values be arranged in increasing order of magnitude for $E_\xi V(\hat Y)$ to be minimum (Reddy, 1976).

ii) In the sense of expected variance, the unstratified $\pi$PS sampling strategy with the HT estimator is inferior to the stratified $\pi$PS strategy with the corresponding HT estimator with $\xi$-optimum allocation (Rao, 1968).

8.10 Show that under the Lahiri-Midzuno-Sen sampling design the variance of the ratio estimator $\hat Y_R$ can be expressed in the form


N 


N 


V(Y R )='£a ii Y? + I.I.a i jY l Yj 

£ l 


where ay 


Y 1 1 

A I —- \<i<N 


n j 


'N' 

n) 


S3 i xfs 1 ) N- 


and a-y =• 


X 


n‘ 


\n) 


I - V’ 1 Zi*jZN. 

sa /j x(s) N 2 


Show, further, that any non-negative unbiased estimator of $V(\hat Y_R)$ must be of the form


Est. V(Y)=-l'La ij (s)X i X 


( 


KJ 

ijes 


Y Y- 

1 1 1 1 




Ui Xj) 


satisfying the unbiased condition 

I a ij (s)p(s)=a ij , \<i<j<N. 

S3 / 

8.11 A population is divided into L strata, with $N_i$ being the number of units in the $i$th stratum. A sample of size n is selected from the entire population with $p(s)$ being proportional to $\sum_{i=1}^{L}N_i\bar x_i$, where $\bar x_i$ is the mean of the auxiliary variable x (assuming that the auxiliary character is known for each unit of the population) based on $n_i$ sample units taken from the $i$th stratum. Show that $\hat Y_{st} = X(\sum N_i\bar y_i)/(\sum N_i\bar x_i)$ is an unbiased estimator of the population total.

Derive the variance of $\hat Y_{st}$ and its unbiased estimate.

8.12 For a population consisting of k strata of sizes $N_1, \ldots, N_k$, a sample of size $n_h$ $(h = 1, \ldots, k)$ is to be selected from the $h$th stratum. Consider a sampling method where one unit from the population is selected with probability proportional to x. If the selected unit belongs to the $h$th stratum, then a simple random sample of $n_h - 1$ units is selected from the remaining $N_h - 1$ units in that stratum. From the other strata, simple random samples of predetermined sizes are selected. Show that $\sum N_h\bar y_h/\sum N_h\bar x_h$ is an unbiased estimate of $Y/X$. Derive its variance.


8.13 If the regression of y on x is linear, show whether in simple random sampling $\bar y/\bar x$ would have a smaller large-sample variance than that of the Hartley-Ross ratio type estimator.

8.14 If the regression of y on x is curvilinear, through the origin, give your comments on the ratio and regression methods of estimation.

8.15 Derive the conditions under which the cluster sampling in conjunction 
with ratio method of estimation would provide a more efficient estimate 
of population mean as compared to simple random sampling in 
conjunction with ratio method of estimation (Mishro and Sukhatme, 
1972). 


8.16 Consider a population of N first stage units, the $j$th of which consists of $M_j$ second stage units. Let $y_{jh}$ and $x_{jh}$ $(h = 1, \ldots, M_j;\ j = 1, \ldots, N)$ be the values of the y and x characteristics corresponding to the $h$th second stage unit of the $j$th first stage unit. Now consider the following sampling scheme:

Draw one second stage unit from the whole population of ssu's with pps to x, and if it happens to belong to the $j$th fsu, select $(m_j - 1)$ ssu's from the remaining $(M_j - 1)$ ssu's by SRSWOR. Further, draw the remaining $(n - 1)$ fsu's and their sub-samples using SRSWOR. Suggest an unbiased estimator of

$R = \sum_{j=1}^{N}\sum_{h=1}^{M_j}y_{jh} \Big/ \sum_{j=1}^{N}\sum_{h=1}^{M_j}x_{jh}$

and find its variance.

8.17 Let $U = (1, 2, \ldots, N)$ be a finite population of $N$ units, partitioned
into $r$ categories $A_1, \ldots, A_r$ according to variate $x$ and into $s$
categories $B_1, \ldots, B_s$ according to variate $y$. Let $N_{ij}$ be the number
of population units in the cell $A_i \cap B_j$ ($i = 1, \ldots, r$;
$j = 1, \ldots, s$). Define

$$N_{i.} = \sum_{j} N_{ij}, \quad N_{.j} = \sum_{i} N_{ij}, \quad P_{i.} = \frac{N_{i.}}{N} \quad \text{and} \quad P_{.j} = \frac{N_{.j}}{N}.$$

Let a SRSWOR of $n$ units be drawn from the population and let $n_{ij}$, $n_{i.}$
and $n_{.j}$ be the total number of units falling in the cell $A_i \cap B_j$, the
class $A_i$ and the class $B_j$ respectively. Calculate $p_{i.} = n_{i.}/n$ and
$p_{.j} = n_{.j}/n$.

Compare the following estimators as regards their efficiency:

i) $\hat{P}_{.j} = n_{.j}/n$

ii) $\hat{P}_{.j} = p_{.j} - \lambda (p_{i.} - P_{i.})$, assuming $P_{i.}$ as known,
where $\lambda$ is a suitably chosen constant.

For $r = s = 2$, compare the following estimators:

$$\hat{P}_{.1} = p_{.1} - (p_{1.} - P_{1.}) \quad \text{(Wynn, 1976)}$$

$$\hat{P}_{.1} = \frac{n_{11}}{n_{1.}} P_{1.} + \frac{n_{21}}{n_{2.}} (1 - P_{1.}) \quad \text{(Hyett and McKenzie, 1977).}$$


8.18 Derive an exact expression for the variance of the ratio estimator R under 
the Midzuno-Sen sampling scheme. (T. J. Rao, 1966). 

Also derive conditions for which the suggested variance estimator is 
positive definite (T. J. Rao, 1977). 

8.19 Compare the efficiencies of strategies involving the ratio estimators based 
on Midzuno-Sen sampling scheme and SRSWOR scheme when the study 
variate y and the auxiliary variate x follow bivariate normal distribution 
(Singh, 1975). 

8.20 Let there be a finite population consisting of $N$ first stage units, the $i$th
of which consists of $M_i$ second stage units. Let
$\bar{M} = (1/N) \sum_{i=1}^{N} M_i$. To estimate the population mean $\bar{Y}$, a
simple random sample (WOR) of $n$ first stage units is selected from the $N$
fsu's. From the $i$th fsu in the sample, a srs of $m_i$ second stage units is again
selected. Define $\bar{y}_{i.}$ and $\bar{Y}_{i.}$ as the sample mean and
population mean of the $i$th first stage unit. Let $x_{ij}$ be the value of the
auxiliary variable corresponding to $y_{ij}$, and let the quantities
$\bar{x}_{i.}$ and $\bar{X}_{i.}$ be defined as before. Let
$r_i = \bar{y}_{i.}/\bar{x}_{i.}$ and $u_i = M_i/\bar{M}$.

Consider the chain ratio estimator (Murthy, 1967)

$$\hat{\bar{Y}}_{CR} = \left( \sum_{i=1}^{n} u_i \bar{X}_{i.} r_i \Big/ \sum_{i=1}^{n} u_i \bar{X}_{i.} \right) \bar{X},$$

$\bar{X}$ being the population mean of the $x$'s in the population.

Derive the approximate bias and mean square error of $\hat{\bar{Y}}_{CR}$ and
compare its efficiency with

$$\hat{\bar{Y}}_{HR} = \left( \sum_{i=1}^{n} u_i \bar{X}_{i.} r_{i(HR)} \Big/ \sum_{i=1}^{n} u_i \bar{X}_{i.} \right) \bar{X},$$

where $r_{i(HR)}$ is the Hartley-Ross type estimator corresponding to $r_i$.

8.21 Consider a stratified cluster sampling where $n_h$ clusters are selected with
equal probability without replacement from the $N_h$ clusters in the $h$th
stratum. On all elements in the selected clusters variables $y$ and $x$ are
measured. Suggest suitable estimators when clusters in each stratum are
(i) of the same size and (ii) of varying sizes for estimating the population
ratio $R = Y/X$. Find the corresponding variance estimators.

8.22 Consider a probability sampling scheme of selecting $n$ units out of a
population consisting of $N$ units in such a way that the first unit is selected
with probability proportional to the size variable $x$ and the remaining
$(n - 1)$ units are selected circular systematically, with the unit selected first
as the random start and with a fixed sampling interval. Show that for such a
selection procedure $\bar{y}/\bar{x}$ is an unbiased estimator of $R$.

8.23 Let $\hat{S}^2_{yr} = s_y^2 (S_x^2 / s_x^2)$ define the ratio estimator of
$S_y^2$ in the case of one auxiliary variable, where $s_y^2$ and $s_x^2$ are the
sample variances and $S_y^2$ and $S_x^2$ are the finite population variances of
$y$ and $x$. Consider two other estimators of $S_y^2$ as

$$\hat{S}^2_{yreg} = s_y^2 + d (S_x^2 - s_x^2), \qquad \hat{S}^{2*}_{yreg} = s_y^2 + d_1 (S_x^2 - s_x^2) + d_2 (S_{xy} - s_{xy}),$$

where $d$, $d_1$ and $d_2$ are either known constants or optimum values of the
constants obtained by minimizing the corresponding mean square errors.

Compare the large sample variances of $\hat{S}^2_{yr}$, $\hat{S}^2_{yreg}$ and
$\hat{S}^{2*}_{yreg}$. Further, in the case of $p$ auxiliary variables, compare
the efficiency of

$$\hat{S}^2_{mr} = \sum_{i=1}^{p} W_i \, s_y^2 \, (S_{x_i}^2 / s_{x_i}^2),$$

where the $W$'s are optimum weights, and

$$\hat{S}^2_{mreg} = s_y^2 + \sum_{i=1}^{p} d_i (S_{x_i}^2 - s_{x_i}^2),$$

where the $d$'s are optimum constants.

8.24 For estimating the mean yield ($y$) of rice per plant, two auxiliary
characters, number of tillers ($x_1$) and percentage of sterility ($x_2$), are
taken. For a population consisting of 50 rice plants (variety C.H. 45) sown at
Central Rice Research Institute, Cuttack, 1979, the values of $y$, $x_1$ and
$x_2$ are given below.

Sl. No.      y   x1     x2    Sl. No.      y   x1     x2
   1     14.80    8   17.0         26     14.80   10   16.0
   2     29.00   15   15.1         27     12.00    9   18.6
   3     12.00    9   19.8         28     15.55   11   19.0
   4     15.55   10   19.1         29     12.35    9   20.1
   5      6.00    6   18.0         30     16.30   11   16.0
   6     12.35    8   19.0         31     14.85   10   18.9
   7     16.30   11   16.3         32     11.25   10   21.0
   8      3.15    8   20.5         33     12.50   10   20.2
   9      6.30    4   15.4         34     14.05    9   17.3
  10      7.05    7   19.8         35     10.70    7   19.1
  11      5.40    5   18.3         36     14.65   10   18.9
  12      7.70    9   19.4         37     15.25   11   20.5
  13      4.95    7   20.4         38     13.90    7   18.6
  14      3.45    2   19.1         39     11.40    8   19.3
  15      8.00   13   19.4         40     15.40   11   20.3
  16     14.85   10   14.8         41     11.75    8   18.7
  17     21.20   11   21.0         42     13.05    7   15.6
  18     10.65    4   18.6         43      15.0    8   19.5
  19     20.50   12   18.3         44     18.30   11   20.0
  20     11.85    9   21.1         45     12.10    9   21.4
  21      5.00   10   21.2         46     18.15   10   16.0
  22      6.05    7   15.0         47     12.15    7   19.1
  23     21.80   11   18.8         48     13.00    9   20.6
  24     22.00   13   16.8         49     13.60   11   21.1
  25     12.50   10   20.0         50     11.05   10   20.1


Compare the efficiencies of the following estimators:

i) Simple mean per unit estimator $\bar{y}$

ii) Ratio-cum-product estimator $[(\bar{y}/\bar{x}_1)\,\bar{x}_2]\,(\bar{X}_1/\bar{X}_2)$

iii) Olkin type weighted estimator

$W (\bar{y}/\bar{x}_1) \bar{X}_1 + (1 - W)\, \bar{y}\, \bar{x}_2 / \bar{X}_2$

iv) Weighted regression estimator

$W [\bar{y} + b_{yx_1} (\bar{X}_1 - \bar{x}_1)] + (1 - W) [\bar{y} + b_{yx_2} (\bar{X}_2 - \bar{x}_2)]$

v) Multivariate regression estimator

to estimate the mean yield of rice per plant for a SRSWOR of size $n < 50$.


8.25 Show that under the super population model
$E(Y_i \mid X_i) = \alpha + \beta X_i$, $V(Y_i \mid X_i) = \sigma^2 X_i^g$ and
$\mathrm{cov}(Y_i, Y_j \mid X_i, X_j) = 0$ ($i \ne j$), the expected variance of the
ratio estimator under the Midzuno scheme of sampling is more than that of the
Horvitz-Thompson estimator with $\pi_i$ proportional to $X_i$, for $g > 1$.

8.26 Compare the following estimators of the population mean $\bar{Y}$ in the
presence of a single auxiliary variable $x$ with respect to bias and mean square
error for a SRSWOR of size $n$ drawn from a population consisting of $N$ units.


i )h=y 




(!+/)-/! 


; ii) t 2 = y 


( 1 +/)-/ 


-1 
V x ) 


iii) h~y 


/+(!-/) 



; iv) t 4 =y 


(X-Jx) 
(1 ~f)X 


v) i [/y + _ ( i-2/)x] 

(X-jx) 


, where / = n/N. 


9. TWO-PHASE SAMPLING 


9.1 Introduction 

When an estimator of a parametric function is constructed using advance
knowledge of some other related parameters, the question arises as to what can
be done when such information is lacking. Moreover, good guess values for the
unknown parameters may not be available either. In this situation it sometimes
pays to take a preliminary large sample, provided the cost of such sampling is
not too high, to obtain good estimates of the parameters concerned. Next, a
sub-sample from the preliminary large sample, or an independent sample from
the remaining part of the population, is taken to observe the study variate $y$.
This procedure of taking a sample in two phases is known as two-phase sampling
or double sampling. Suppose a large sample $s'$ of size $n'$ is drawn from the
finite population $U = (1, \ldots, N)$ of $N$ units with probability $p_1(s')$, in
order to estimate a parametric function concerning one or more auxiliary
variables or to estimate strata weights in the case of stratified random
sampling. If a sub-sample $s$ of size $n$ is then drawn from the $n'$ units of $s'$
with conditional probability $p_2(s \mid s')$ and the values of the variate of
interest $y$ are ascertained on the units of $s$, then the overall sample
$s^* = (s', s)$ has the probability $p(s^*) = p_1(s')\, p_2(s \mid s')$.
Thus, we have sampling designs $p_1$, $p_2$ and $p$ corresponding to the first
phase, the second phase and the overall sample. If the second phase sample $s$
is drawn from $U - s'$, then $p(s^*) = p_1(s')\, p_2(s \mid U - s')$. Ghosh (1963)
considers a two-phase sampling where the first phase is used to form clusters
for the second phase sampling. Neyman (1938), Bose (1943, 1951) and Cox (1952)
are among the early users of double sampling.

Sarndal and Swensson (1987) have given a general approach to two-phase
estimators based on $\pi$-estimators, with emphasis on regression estimation.
Chaudhuri and Adhikary (1983a) have extended certain single sampling results
concerning the non-existence of UMV estimators and optimality properties of
estimators to the double sampling context. They have also shown the
admissibility of an estimator in double sampling with varying probabilities,
following the work of Rao and Bellhouse (1978).

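Define the first phase inclusion probabilities as

$$\pi'_i = \sum_{s' \ni i} p_1(s'), \qquad \pi'_{ij} = \sum_{s' \ni i, j} p_1(s').$$

Put $D'_{ij} = \pi'_{ij} - \pi'_i \pi'_j$. Assume that $\pi'_i > 0$ for all $i$ and
$\pi'_{ij} > 0$ for all $i \ne j$. In the second phase a sample $s$
($s \subset s'$) of size $n$ is drawn from $s'$ with $p_2(s \mid s')$ as the
conditional probability of choosing $s$ given $s'$.

The conditional inclusion probabilities are

$$\pi_{i \mid s'} = \sum_{s \ni i} p_2(s \mid s')$$

and

$$\pi_{ij \mid s'} = \sum_{s \ni i, j} p_2(s \mid s'),$$

assuming that for any $s'$, $\pi_{i \mid s'} > 0$ for all $i \in s'$ and
$\pi_{ij \mid s'} > 0$ for all $i \ne j \in s'$. Put
$D_{ij \mid s'} = \pi_{ij \mid s'} - \pi_{i \mid s'} \pi_{j \mid s'}$.

Now, for all $i, j \in s'$ and for any $s'$, denote

$$\pi^*_i = \pi'_i \, \pi_{i \mid s'}, \qquad \pi^*_{ij} = \pi'_{ij} \, \pi_{ij \mid s'} \qquad \text{and} \qquad D^*_{ij} = \pi^*_{ij} - \pi^*_i \pi^*_j.$$

Define $Y'_i = Y_i / \pi'_i$ and $Y''_i = Y'_i / \pi_{i \mid s'} = Y_i / \pi^*_i$.

Thus, in two-phase sampling a design unbiased estimator of the population
total $Y$ is given by

$$\hat{Y}_{\pi^*} = \sum_{i \in s} Y_i / \pi^*_i$$

and the design variance is

$$V(\hat{Y}_{\pi^*}) = \sum\sum_{U} D'_{ij} Y'_i Y'_j + E_1 \Big[ \sum\sum_{s'} D_{ij \mid s'} Y''_i Y''_j \Big],$$

where $E_1$ denotes expectation with respect to the sampling design in the first
phase.

An unbiased estimator of $V(\hat{Y}_{\pi^*})$ is

$$v(\hat{Y}_{\pi^*}) = \sum\sum_{s} \frac{D'_{ij}}{\pi'_{ij}} \frac{Y'_i Y'_j}{\pi_{ij \mid s'}} + \sum\sum_{s} \frac{D_{ij \mid s'}}{\pi_{ij \mid s'}} Y''_i Y''_j = \sum\sum_{s} \frac{D^*_{ij}}{\pi^*_{ij}} Y''_i Y''_j.$$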
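
As a small numerical illustration of the estimator of the total above, the sketch below takes the special case where both phases are SRSWOR, so that the first phase inclusion probability is n'/N and the conditional second phase inclusion probability is n/n'. The function name and its arguments are illustrative assumptions and are not part of the text.

```python
def two_phase_total(y_sub, N, n_first):
    """Estimate the population total from a two-phase sample.

    Assumes SRSWOR in both phases (an assumption made for this sketch):
    pi'_i = n'/N and pi_{i|s'} = n/n', so pi*_i = n/N for every unit.
    y_sub holds the y-values observed on the second phase sample.
    """
    n = len(y_sub)
    pi_first = n_first / N        # first phase inclusion probability
    pi_cond = n / n_first         # conditional second phase probability
    pi_star = pi_first * pi_cond  # overall inclusion probability
    return sum(y / pi_star for y in y_sub)


# Hypothetical data: N = 10 units, n' = 5, and y observed on n = 2 units.
estimate = two_phase_total([2.0, 4.0], N=10, n_first=5)
```

With equal probabilities at both phases the estimator reduces to N times the second phase sample mean, which is what the hypothetical call computes.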

9.2 Two-phase Sampling for Stratification 

Stratified random sampling presupposes advance knowledge of the strata
weights and of the sampling frame within strata. In many practical situations
the strata weights are known but the frame within strata is not available. This
problem may be solved by the technique of post stratification. If the strata
weights are not known exactly, one may take a large sample of size $n'$ from the
population to estimate them. Further, the information on the stratification
variable may not be readily available, and it may therefore be necessary to
divert part of the resources to collecting information on one or more
auxiliary variables from a preliminary large sample, for the purpose of
stratification as well as for use in the method of estimation. Suppose a
population of size $N$ is to be stratified according to the values of an
auxiliary variable $x$, but the frequency distribution of $x$ is unknown and
hence so are the strata weights $W_h = N_h/N$ ($h = 1, \ldots, k$). In this
situation we take a preliminary large sample $s'$ of size $n'$ by SRSWOR and
observe $x$ alone, and the selected units are classified into $k$ strata
according to $x$. Let $s'_h$ contain the $n'_h$ units falling into stratum $h$
($h = 1, \ldots, k$; $\sum_h n'_h = n'$). A sub-sample $s_h$ of size $n_h$ is
drawn from the $n'_h$ units falling into the $h$th stratum by SRSWOR,
independently for each $h$, and the main character $y$ is observed. Thus,
$n_h = \nu_h n'_h$ where $0 < \nu_h \le 1$, with $\nu_h$ being chosen in advance.
Let $\bar{y}_h$ and $s_h^2$ denote the sample mean of the $y$-values and the
sample estimator of $S_h^2$ based on the sample of size $n_h$.

The usual stratified estimator is

$$\bar{y}_{st-d} = \sum_{h=1}^{k} \frac{n'_h}{n'} \bar{y}_h = \sum_{h=1}^{k} w_h \bar{y}_h,$$

where $w_h = n'_h/n'$. It may be mentioned here that $w_h$, $n_h$ and $\bar{y}_h$
are all random variables. $n_h$ is bounded above by the random variable $n'_h$,
which varies from 0 to $\min(n', N_h)$. Singh and Singh (1965) formulated three
sub-sampling procedures free of this inconsistency: (i) $n_h$ units with
replacement, with all units used in the estimator, (ii) the same but with only
the distinct units used, and (iii) $\min(n'_h, n_h)$ units without replacement.
We assume here that $n'$ is so large that the probability of any $n'_h$ taking
the value 1 or zero is negligible. Should such a situation arise, we may
increase $n'$ until this condition is satisfied, or merge strata so that all
$n'_h$ exceed 1.

Denoting $E_1$ and $E_2$ as expectations over first and second phase samples
respectively and $V_1$ and $V_2$ as variances with respect to first phase and
second phase samples respectively, we have

$$E \Big[ \sum_{h=1}^{k} w_h \bar{y}_h \Big] = E_1 \Big[ \sum_{h=1}^{k} w_h E_2 (\bar{y}_h \mid w_h) \Big] = E_1 \Big[ \sum_{h=1}^{k} w_h \bar{Y}_h \Big] = \sum_{h=1}^{k} W_h \bar{Y}_h = \bar{Y},$$

where $E(w_h) = W_h$, thus showing that $\sum_h w_h \bar{y}_h$ is an unbiased
estimate of the population mean $\bar{Y}$.
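
The estimator can be computed directly from the first phase stratum counts and the second phase observations. The following is a minimal sketch; the function name, the label-based data layout and the toy numbers are assumptions made for illustration only.

```python
from collections import Counter


def double_sampling_stratified_mean(first_phase_strata, sub_sample_y):
    """Double sampling estimate of the mean with estimated strata weights.

    first_phase_strata: stratum label of every unit in the first phase
        sample s' (only x, and hence the stratum, is observed there).
    sub_sample_y: dict mapping a stratum label to the list of y-values
        observed on the second phase sub-sample from that stratum.
    """
    n_first = len(first_phase_strata)
    counts = Counter(first_phase_strata)      # n'_h for every stratum
    estimate = 0.0
    for label, n_first_h in counts.items():
        y_values = sub_sample_y[label]
        w_h = n_first_h / n_first             # estimated weight w_h = n'_h / n'
        y_bar_h = sum(y_values) / len(y_values)
        estimate += w_h * y_bar_h
    return estimate


# Hypothetical example: n' = 10 with 6 units in stratum "a" and 4 in "b".
strata = ["a"] * 6 + ["b"] * 4
estimate = double_sampling_stratified_mean(strata, {"a": [1.0, 3.0], "b": [10.0]})
```

Here the weights are 0.6 and 0.4, so the estimate is 0.6 times the sub-sample mean of stratum "a" plus 0.4 times that of stratum "b".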

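To calculate the sampling variance of $\bar{y}_{st-d}$, a treatment due to
J.N.K. Rao (1973) follows:

$$V(\bar{y}_{st-d}) = E_1 V_2 (\bar{y}_{st-d}) + V_1 E_2 (\bar{y}_{st-d}) = E_1 \Big[ \sum_h w_h^2 V_2 (\bar{y}_h) \Big] + V_1 \Big[ \sum_h w_h \bar{Y}_h \Big].$$

Using the results $V(w_h) = (g'/n') W_h (1 - W_h)$ and
$\mathrm{cov}(w_h, w_t) = -(g'/n') W_h W_t$ ($h \ne t$), where
$g' = (N - n')/(N - 1)$, and again
$V_2 (\bar{y}_h) = (1/n_h - 1/N_h) S_h^2$, we get

$$V(\bar{y}_{st-d}) = \sum_h S_h^2 \Big( \frac{W_h}{n' \nu_h} - \frac{A_h}{N_h} \Big) + g' \frac{B}{n'},$$

where $A_h = E(w_h^2)$ and $B = \sum_h W_h (\bar{Y}_h - \bar{Y})^2$.
On simplification,

$$V(\bar{y}_{st-d}) = \sum_h W_h S_h^2 \Big( \frac{1}{n' \nu_h} - \frac{1}{N} \Big) - \frac{g'}{N n'} \sum_h (1 - W_h) S_h^2 + \frac{g'}{n'} \sum_h W_h (\bar{Y}_h - \bar{Y})^2.$$

By using the identity

$$\frac{(N - 1)}{N} S^2 = \sum_h W_h (\bar{Y}_h - \bar{Y})^2 + \sum_h \Big( W_h - \frac{1}{N} \Big) S_h^2,$$

an alternative expression for $V(\bar{y}_{st-d})$ may be written as

$$V(\bar{y}_{st-d}) = \Big( \frac{1}{n'} - \frac{1}{N} \Big) S^2 + \sum_h \frac{W_h S_h^2}{n'} \Big( \frac{1}{\nu_h} - 1 \Big).$$

For proportional allocation with $\nu_h = n/n'$ and ignoring terms with
$n_h/N_h$, we get

$$V_P (\bar{y}_{st-d}) = \frac{\sum_h W_h S_h^2}{n} + \frac{g' B}{n'}.$$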

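Following term by term estimation, a non-negative unbiased estimate of
$V(\bar{y}_{st-d})$ is

$$v(\bar{y}_{st-d}) = \frac{n'(N - 1)}{(n' - 1) N} \Big[ \sum_h w_h s_h^2 \Big( \frac{1}{n' \nu_h} - \frac{1}{N} \Big) + \frac{g'}{n'} \sum_h w_h (\bar{y}_h - \bar{y}_{st-d})^2 + \frac{g'}{n'} \sum_h s_h^2 \Big( \frac{w_h}{N} - \frac{1}{n' \nu_h} \Big) \Big].$$

Alternatively,

$$v(\bar{y}_{st-d}) = \frac{N - 1}{N} \sum_h \Big( \frac{n'_h - 1}{n' - 1} - \frac{n_h - 1}{N - 1} \Big) \frac{w_h s_h^2}{n_h} + \frac{N - n'}{N (n' - 1)} \sum_h w_h (\bar{y}_h - \bar{y}_{st-d})^2.$$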


To find the optimum values of $n'$ and $\nu_h = n_h/n'_h$ we minimize
$V(\bar{y}_{st-d})$ for a given cost function, say,

$$C = c' n' + \sum_h c_h n_h.$$

Since the $n_h$ are random variables, $C$ is random and hence we in fact use the
expected cost

$$C^* = E(C) = c' n' + \sum_h c_h E(n_h) = c' n' + n' \sum_h c_h \nu_h W_h.$$

Using the Cauchy-Schwarz inequality, the optimum allocation is given by

$$\nu_h = S_h \Big\{ \frac{c'}{c_h (S^2 - \sum_h W_h S_h^2)} \Big\}^{1/2}, \quad (h = 1, \ldots, k),$$

provided $\nu_h \le 1$ for all $h$.

The optimum value of $n'$ is obtained by substituting $\nu_h$ in the expected
cost function and then solving for $n'$. Thus,

$$V_{opt} (\bar{y}_{st-d}) = \frac{1}{C^*} \Big[ \sum_h W_h S_h \sqrt{c_h} + \Big( S^2 - \sum_h W_h S_h^2 \Big)^{1/2} \sqrt{c'} \Big]^2 - \frac{S^2}{N}.$$
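
The allocation rule is easy to evaluate once guesses of the strata weights and variances are available. The sketch below computes the optimum sub-sampling fractions and the first phase sample size that exhausts a given expected cost; the function name, argument names and the numerical inputs are hypothetical and serve only to illustrate the formulas.

```python
import math


def optimum_subsampling(W, S_h, c_h, S2, c_first, expected_cost):
    """Optimum nu_h and n' for double sampling for stratification.

    W, S_h, c_h: strata weights, strata standard deviations and per-unit
        second phase costs (lists of equal length).
    S2: overall population variance S^2.
    c_first: cost c' of classifying one first phase unit.
    expected_cost: the fixed expected budget C*.
    """
    within = sum(w * s ** 2 for w, s in zip(W, S_h))
    between = S2 - within
    if between <= 0:
        raise ValueError("S^2 must exceed sum of W_h * S_h^2")
    nu = [s * math.sqrt(c_first / (c * between)) for s, c in zip(S_h, c_h)]
    if any(v > 1 for v in nu):
        raise ValueError("the formula requires nu_h <= 1 for every stratum")
    # C* = n' * (c' + sum c_h nu_h W_h), solved for n'.
    cost_per_first_phase_unit = c_first + sum(
        c * v * w for c, v, w in zip(c_h, nu, W)
    )
    n_first = expected_cost / cost_per_first_phase_unit
    return nu, n_first


# Hypothetical inputs: two equal strata, S_h = 1 and 2, S^2 = 6.5.
nu, n_first = optimum_subsampling(
    [0.5, 0.5], [1.0, 2.0], [4.0, 4.0], 6.5, 1.0, 100.0
)
```

The function raises an error rather than truncating when some fraction exceeds one, since the closed form in the text holds only when every fraction is at most one.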


If the strata weights are not accurately known, Rao (1973), following
Srinath (1971), suggested a different sub-sampling procedure, namely to choose

$$n_h = \frac{n_h'^2}{\alpha_h n' + n'_h} \quad (h = 1, \ldots, k),$$

where $\alpha_h > 0$ is a constant fixed in advance. Such a procedure guarantees
the desired precision at a slightly higher cost than the optimum and is thought
to be more robust against poor guesses at the $W_h$.

In this situation the sub-sampling fraction $n_h/n'_h$ varies according to the
observed value of $n'_h$. Thus,

$$V(\bar{y}_{st-d}) = \Big( \frac{1}{n'} - \frac{1}{N} \Big) S^2 + E_1 \Big[ \sum_h w_h^2 \Big( \frac{1}{n_h} - \frac{1}{n'_h} \Big) S_h^2 \Big] = \Big( \frac{1}{n'} - \frac{1}{N} \Big) S^2 + \frac{1}{n'} \sum_h \alpha_h S_h^2,$$

which is independent of the weights $W_h$. The expected cost under Srinath's
rule is

$$E(C) = n' c' + \sum_h E \Big[ \frac{n_h'^2}{\alpha_h n' + n'_h} \Big] c_h \doteq n' c' + \sum_h \frac{n' W_h^2 c_h}{\alpha_h + W_h}.$$

The optimal $\alpha_h$ is given by

$$\alpha_h = \frac{(S^2 - \sum_h W_h S_h^2)^{1/2}}{S_h \sqrt{c'}} W_h \sqrt{c_h} - W_h, \quad (h = 1, \ldots, k).$$

Ige and Tripathi (1987) have suggested alternative sampling strategies
based on two-phase stratification, utilizing the auxiliary information obtained
on the first phase sample both at the design and at the estimation stages.

9.3 Two-phase Ratio Estimator 

In the ratio method of estimation of the population mean or total of the main
character $y$, it is presupposed that the population mean $\bar{X}$ or the
population total $X$ is known in advance. In case this is not so, one may use
two-phase sampling, in which the first sample ($s'$), usually a large one of
size $n'$, is selected with a probability sampling design to obtain a good
estimate $\bar{x}'$ of $\bar{X}$, and then a probability sample ($s$) of size
$n$ is drawn, either out of the $n'$ units or from the remaining $N - n'$ units
of the population, to observe the main character $y$ under study.

Assuming SRSWOR in both the phases, we have the two-phase sample estimate of
the population mean given by

$$\bar{y}_{rd} = (\bar{y}/\bar{x}) \bar{x}'.$$

Following the usual binomial or Taylor series expansion of
$(\bar{y}/\bar{x}) \bar{x}'$, we have, to the first order of approximation, the
approximate bias

$$B_1 (\bar{y}_{rd}) = \bar{Y} \Big( \frac{1}{n} - \frac{1}{n'} \Big) (C_x^2 - \rho C_x C_y),$$

which may be negligible for $n$ sufficiently large. The exact bias is given by

$$B(\bar{y}_{rd}) = E(\bar{y}_{rd}) - \bar{Y} = -E_1 [\mathrm{cov} (\bar{y}/\bar{x}, \bar{x} \mid s')],$$

where $E_1$ is over samples of size $n'$. The bias vanishes if the relationship
between $y$ and $x$ is a straight line passing through the origin, in which
case $\bar{y}_{rd}$ becomes an unbiased estimator of $\bar{Y}$.

Now

$$\bar{y}_{rd} - \bar{Y} = \Big( \frac{\bar{y}}{\bar{x}} - R \Big) \bar{x}' + R (\bar{x}' - \bar{X}) \doteq (\bar{y} - R \bar{x}) + R (\bar{x}' - \bar{X}).$$

If the first preliminary large sample and the second sample are drawn
independently,

$$V(\bar{y}_{rd}) = \Big[ \frac{1}{n} - \frac{1}{(N - n')} \Big] (S_y^2 - 2 R S_{yx} + R^2 S_x^2) + \Big( \frac{1}{n'} - \frac{1}{N} \Big) R^2 S_x^2.$$

In case the second sample is a sub-sample of the first,

$$V(\bar{y}_{rd}) = E_1 V_2 (\bar{y}_{rd}) + V_1 E_2 (\bar{y}_{rd}) = \Big( \frac{1}{n} - \frac{1}{n'} \Big) (S_y^2 - 2 R S_{xy} + R^2 S_x^2) + \Big( \frac{1}{n'} - \frac{1}{N} \Big) S_y^2.$$

In both the cases $V(\bar{y}_{rd})$ is of the form $V/n + V'/n'$, neglecting the
finite population correction factor.

Let the cost function be of the form $C = cn + c'n'$, where $c'$ and $c$ are the
costs of observing a unit in the first phase and second phase respectively. The
optimum values of $n'$ and $n$ can be obtained by minimizing $V(\bar{y}_{rd})$
for fixed $C$ and are given by the relation

$$\frac{n}{\sqrt{V/c}} = \frac{n'}{\sqrt{V'/c'}} = \frac{(cn + c'n')}{\sqrt{cV} + \sqrt{c'V'}}.$$

Thus,

$$V_{opt} (\bar{y}_{rd}) = (\sqrt{cV} + \sqrt{c'V'})^2 / C.$$
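
The optimum sample sizes follow mechanically from the variance components and the unit costs. The sketch below is an illustration under the assumption that the variance has the form V/n + V'/n' and the cost is cn + c'n'; the function name and the numerical values are hypothetical.

```python
import math


def optimum_two_phase_sizes(V, V_first, c, c_first, budget):
    """Minimize V/n + V'/n' subject to c*n + c'*n' = budget.

    V, V_first: coefficients of 1/n and 1/n' in the variance.
    c, c_first: per-unit costs of the second and first phase.
    Returns (n, n', minimum variance); sizes are not rounded.
    """
    denom = math.sqrt(V * c) + math.sqrt(V_first * c_first)
    n = budget * math.sqrt(V / c) / denom
    n_first = budget * math.sqrt(V_first / c_first) / denom
    v_opt = denom ** 2 / budget
    return n, n_first, v_opt


# Hypothetical inputs: V = 4, V' = 9, c = 4, c' = 1 and a budget of 100.
n, n_first, v_opt = optimum_two_phase_sizes(4.0, 9.0, 4.0, 1.0, 100.0)
```

In practice the returned sizes would be rounded to integers, and one would check that the first phase size is at least as large as the second phase size when the second sample is a sub-sample of the first.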


P.S.R.S. Rao (1975a), while discussing two-phase ratio estimators with
two independent samples of sizes $n'$ and $n$ drawn from a finite population,
has compared the estimators

(a) $t_{rd} = (\bar{y}/\bar{x}) \bar{x}'$, (b) $t_{\nu rd} = (\bar{y}/\bar{x}) \bar{x}_\nu$ and (c) $t^*_{rd} = (\bar{y}/\bar{x}) \bar{x}^*$,

where $\bar{y}$, $\bar{x}$ are the sample means of $y$ and $x$ based on $n$
observations, $\bar{x}'$ is the sample mean of $x$ based on $n'$ observations,
$\bar{x}_\nu$ is the sample mean of $x$ based on the $\nu$ distinct units in the
combined sample of $n'$ and $n$ observations, where the number of distinct units
is $\nu = n' + n - d$, $d$ being the number of common units in the combined
sample, and $\bar{x}^* = a \bar{x}' + b \bar{x}$, where $a$ and $b$ are suitably
chosen constants minimizing $V(\bar{x}^*)$ subject to $a + b = 1$.

It has been shown by Rao (1975a) that (i) $t_{\nu rd}$ and $t^*_{rd}$ have
smaller mean square error than $t_{rd}$ if $\rho < C_x/C_y$ and (ii) $t_{\nu rd}$
has smaller mean square error than $t^*_{rd}$ if $\rho > \frac{1}{2} (C_x/C_y)$.

Srivastava (1970) has considered a generalized two-phase ratio estimator
of the form $h_1 = \bar{y} (\bar{x}'/\bar{x})^\alpha$, where $\alpha$ is a
constant. Another generalized two-phase ratio estimator, given by Walsh (1970),
is

$$h_2 = \frac{\bar{y} \bar{x}'}{[\alpha \bar{x} + (1 - \alpha) \bar{x}']}.$$

Sahoo and Swain (1989) have compared $h_1$ and $h_2$ when $\alpha$ is estimated
from the second phase sample. Swain (1964) has considered the two-phase ratio
estimator when the sampling design is systematic sampling. Mishra and
Sukhatme (1972) have studied the efficiency of cluster sampling in conjunction
with two-phase ratio and regression methods of estimation.

9.4 Two-phase Regression Estimator


The regression type estimators of the population mean or total of $y$
presuppose advance knowledge of either the population mean $\bar{X}$ or the
total $X$ of the auxiliary variable $x$. In the absence of such information a
larger sample ($s'$) of size $n'$ is drawn to observe $x$ and thereby to estimate
$\bar{X}$, while a sub-sample ($s$) of size $n$ is drawn from $s'$ to measure the
characteristic $y$. Thus, the two-phase regression type estimator of the
population mean $\bar{Y}$ is suggested as

$$\bar{y}_{reg-d} = \bar{y} + \hat{B} (\bar{x}' - \bar{x}),$$

where $\hat{B}$ is the estimate of the regression coefficient computed from the
sub-sample and $\bar{x}'$ is an estimate of $\bar{X}$ from the larger preliminary
sample.
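
A minimal computational sketch of this estimator is given below. It assumes that the regression coefficient is estimated by ordinary least squares from the sub-sample; the function name and the toy data are hypothetical.

```python
def two_phase_regression_mean(x_first, sub_pairs):
    """Two-phase regression estimate of the population mean of y.

    x_first: x-values observed on the first phase sample s'.
    sub_pairs: (x, y) pairs observed on the second phase sub-sample s.
    The slope is the least squares estimate from the sub-sample.
    """
    n = len(sub_pairs)
    x_bar_first = sum(x_first) / len(x_first)
    x_bar = sum(x for x, _ in sub_pairs) / n
    y_bar = sum(y for _, y in sub_pairs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in sub_pairs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in sub_pairs)
    b_hat = s_xy / s_xx
    return y_bar + b_hat * (x_bar_first - x_bar)


# Hypothetical data in which y = 2x + 1 exactly on the sub-sample.
estimate = two_phase_regression_mean(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
)
```

Because the toy relationship is exactly linear, the estimate equals the fitted line evaluated at the first phase mean of x.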


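$$\mathrm{Bias} (\bar{y}_{reg-d}) = E(\bar{y}_{reg-d}) - \bar{Y} \doteq -\frac{N}{(N - 2)} \Big( \frac{1}{n} - \frac{1}{n'} \Big) B \Big( \frac{\mu_{21}}{\mu_{11}} - \frac{\mu_{30}}{\mu_{20}} \Big),$$

if SRSWOR is adopted in both the phases, showing thereby that the bias is
negligible if the sample size $n$ is sufficiently large and vanishes if the
regression of $y$ on $x$ is linear.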

The mean square error of $\bar{y}_{reg-d}$ to the first order of approximation
is

$$MSE(\bar{y}_{reg-d}) = V(\bar{y}_{reg-d}) = E_1 V_2 (\bar{y}_{reg-d}) + V_1 E_2 (\bar{y}_{reg-d}) = \Big( \frac{1}{n} - \frac{1}{n'} \Big) S_y^2 (1 - \rho^2) + \Big( \frac{1}{n'} - \frac{1}{N} \Big) S_y^2.$$

Note: $\bar{y}_{reg-d}$ is always more efficient than $\bar{y}_{rd}$ in large
samples unless $R = B$, and is also more efficient than the simple expansion
estimator $\bar{y}$.

Assuming an infinite linear regression model, Cochran (1953) has
examined the conditional bias of $\bar{y}_{reg-d}$ in repeated samples in which
the $x_i$'s are fixed both in the first and second samples. If the bias is not
appreciable, the conditional variance of $\bar{y}_{reg-d}$ is

$$V(\bar{y}_{reg-d}) = S_y^2 (1 - \rho^2) \Big[ \frac{1}{n} + \frac{(\bar{x}' - \bar{x})^2}{\sum (x_i - \bar{x})^2} \Big] + B^2 (\bar{x}' - \bar{X})^2.$$

However, to compare with other methods of sampling, the average variance of
$\bar{y}_{reg-d}$, under the assumptions that the first sample and the second
sub-sample are drawn at random and $x$ is normally distributed, is given by

$$\bar{V} (\bar{y}_{reg-d}) = S_y^2 (1 - \rho^2) \Big[ \frac{1}{n} + \Big( \frac{1}{n} - \frac{1}{n'} \Big) \frac{1}{(n - 3)} \Big] + \frac{B^2 S_x^2}{n'}.$$


If the second sample is drawn independently of the first,
$V(\bar{y}_{reg-d})$ reduces to a form where $(1/n - 1/n')$ is replaced by
$(1/n + 1/n')$. Neglecting the f.p.c. and to the first order of approximation,
write

$$V(\bar{y}_{reg-d}) = \frac{V}{n} + \frac{V'}{n'},$$

where $V = S_y^2 (1 - \rho^2)$ and $V' = \rho^2 S_y^2$.

Assuming the cost $C = nc + n'c'$, the optimum values of $n$ and $n'$ can be
obtained in the usual manner by minimizing $V(\bar{y}_{reg-d})$ subject to the
fixed cost $C = nc + n'c'$. With the optimum $n$ and $n'$,

$$V_{opt} (\bar{y}_{reg-d}) = \frac{S_y^2 \big[ \sqrt{(1 - \rho^2) c} + \rho \sqrt{c'} \big]^2}{C},$$

where $\rho$ is positive.

When there is no adjustment for regression, the sample size under SRS is
determined by $n_0 = C/c$ and thus $V(\bar{y}) = (c/C) S_y^2$.

On comparison, the double sampling regression type estimator would be more
precise than the simple mean without the use of auxiliary information if
$V_{opt} (\bar{y}_{reg-d}) < (c/C) S_y^2$, which implies

$$\frac{c}{c'} > \frac{\rho^2}{[1 - \sqrt{(1 - \rho^2)}]^2}$$

or

$$\rho^2 > \frac{4cc'}{(c + c')^2}.$$

The relative efficiency of double sampling is given by

$$\frac{V(\bar{y})}{V_{opt} (\bar{y}_{reg-d})} = \frac{c}{\big[ \sqrt{(1 - \rho^2) c} + \rho \sqrt{c'} \big]^2}.$$

If $c'$ is smaller than $c$, there may be a considerable gain in efficiency from
two-phase sampling even for moderate values of $\rho$.
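
The break-even correlation and the relative efficiency are simple functions of the two unit costs. The sketch below evaluates both under the assumptions used above (f.p.c. neglected, positive correlation, first phase cheaper than second phase); the function names and the cost figures are hypothetical.

```python
import math


def relative_efficiency(rho, c, c_first):
    """V(simple mean) / V_opt(double sampling regression) at equal cost.

    rho: correlation between y and x (taken as positive).
    c, c_first: per-unit costs of observing y and of observing x alone.
    """
    return c / (math.sqrt((1 - rho ** 2) * c) + rho * math.sqrt(c_first)) ** 2


def break_even_rho(c, c_first):
    """Correlation above which double sampling beats the simple mean."""
    return 2 * math.sqrt(c * c_first) / (c + c_first)


# Hypothetical costs: observing x alone costs one tenth of observing y.
gain = relative_efficiency(0.8, c=1.0, c_first=0.1)
threshold = break_even_rho(c=1.0, c_first=0.1)
```

At the break-even correlation the relative efficiency equals one; for correlations above it double sampling is more precise than simple random sampling of the same total cost.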

P.S.R.S. Rao (1972) has considered two-phase regression estimation when the
two samples are drawn independently and the regression coefficient $B$ is
known, and has obtained the minimum variance linear unbiased regression
estimator for $\bar{Y}$. Let $\bar{x}_1$ and $\bar{x}_2$ be two independent
unbiased estimates of $\bar{X}$ computed from two independent samples of sizes
$n_1$ and $n_2$ respectively drawn from the finite population.

Define $\bar{x}^* = a_1 \bar{x}_1 + a_2 \bar{x}_2$ and
$t_d = \bar{y}_2 + B (\bar{x}^* - \bar{x}_2)$, where $a_1$ and $a_2$ are chosen
constants with $a_1 + a_2 = 1$ and $\bar{y}_2$ is an unbiased estimator of
$\bar{Y}$ computed from the sample of size $n_2$.

It may be seen that $V(t_d)$ is minimized when $V(\bar{x}^*)$ is minimized
subject to $a_1 + a_2 = 1$. The optimum $a_1$ and $a_2$ are

$$a_1 = \frac{(N - n_2) n_1}{(Nn - 2 n_1 n_2)} \quad \text{and} \quad a_2 = \frac{(N - n_1) n_2}{(Nn - 2 n_1 n_2)},$$

where $n_1 + n_2 = n$.

Thus, the optimum regression estimator of $\bar{Y}$ is

$$t_{d(opt)} = \bar{y}_2 + B \Big[ \frac{(N - n_2) n_1}{(Nn - 2 n_1 n_2)} \Big] (\bar{x}_1 - \bar{x}_2)$$

with variance

$$V(t_{d(opt)}) = \frac{N - n_2}{N n_2} S_y^2 \Big[ 1 - \frac{(N - n_2) n_1}{Nn - 2 n_1 n_2} \rho^2 \Big].$$

If the units of the two independent samples are recorded and the number of
common units $d$ and the number of distinct units $\nu = n_1 + n_2 - d$ are
identified, the regression estimate with $\nu$ distinct units suggested by
P.S.R.S. Rao (1972) is

$$t_{\nu d} = \bar{y}_2 + B (\bar{x}_\nu - \bar{x}_2),$$

where $\bar{x}_\nu = \sum_{i=1}^{\nu} x_i / \nu$. It may be verified that
$V(t_{\nu d}) \le V_{opt} (t_d)$.
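
The optimum combining weights above are the usual inverse-variance weights for two independent SRSWOR means from the same population. A small sketch that computes them follows; the function name and the sample sizes used in the call are hypothetical.

```python
def optimum_combining_weights(N, n1, n2):
    """Weights a1, a2 (a1 + a2 = 1) minimizing V(a1*x1_bar + a2*x2_bar).

    x1_bar and x2_bar are means of independent SRSWOR samples of sizes
    n1 and n2 from a population of N units, so that
    V(xi_bar) = (1/ni - 1/N) * S_x^2.
    """
    n = n1 + n2
    denom = N * n - 2 * n1 * n2
    a1 = (N - n2) * n1 / denom
    a2 = (N - n1) * n2 / denom
    return a1, a2


# Hypothetical sizes: N = 100, n1 = 20, n2 = 10.
a1, a2 = optimum_combining_weights(100, 20, 10)
```

The larger sample receives the larger weight, and the two weights coincide with the ratio of each variance to the sum of the two variances, interchanged.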

9.5 Two-phase Sampling with Multi-auxiliary Variables 

Two-phase sampling with single auxiliary variable can be extended to 
multivariate case following the work of Raj (1965a), Khan and Tripathi (1967) 
and Tripathi (1976). Olkin’s (1958) weighted ratio estimator, Shukla’s (1965) 
multivariate regression estimator and Singh’s (1967a) ratio-cum-product 
estimators can be easily worked out for two-phase sampling. 

They all belong to a class of estimators of the population mean $\bar{Y}$ of the
main character $y$ defined as

$$\hat{\mu}_y + t' (\hat{\mu}^* - \hat{\mu}),$$

where (i) $\hat{\mu}_y$ is an unbiased estimator of $\bar{Y}$ from the second
sample, (ii) $\hat{\mu}^* = (\hat{\mu}^*_1, \ldots, \hat{\mu}^*_p)'$ and
$\hat{\mu} = (\hat{\mu}_1, \ldots, \hat{\mu}_p)'$, where $\hat{\mu}^*_k$ and
$\hat{\mu}_k$ are estimates of $\mu_k$ ($k = 1, \ldots, p$), $\mu_k$ being the
population mean of the $k$th auxiliary variable $x_k$ ($k = 1, \ldots, p$), from
the first phase and second phase samples respectively, based on any sampling
design, and (iii) $t' = (t_1, \ldots, t_p)$ is a suitably chosen vector of
statistics such that $E(t_k)$ exists for all $k = 1, \ldots, p$.

Sarndal and Swensson (1987) have given some general results for
two-phase sampling with multivariate auxiliary information.

Assuming that all the $p$ auxiliary variables are observed on the first phase
sample and that in the second phase an independent sample is chosen to observe
both $y$ and the same $p$ auxiliary variables, Srivastava (1981) proposed a
generalized two-phase estimator of the population mean which turned out to be
superior to the usual regression estimator. Bedi (1985) has extended the
two-phase regression estimator for $\bar{Y}$ due to P.S.R.S. Rao (1972) to the
case when information on more than one auxiliary variable is available.

Sukhatme and Koshal (1959) have extended the double sampling technique
with a single auxiliary variable to multi-stage designs for use with the ratio
method of estimation. Sukhatme (1962) made a comparative study of biased
and unbiased ratio type estimators of the population mean of the main variate
$y$ in two-phase sampling. Goswami and Sukhatme (1965) considered the ratio
method of estimation in multiphase sampling with several auxiliary variables.

Maity, Das and Tripathi (1989) have presented a class of estimators based
on double sampling for estimating the population ratio
$R = \bar{Y}_0/\bar{Y}_1$ of two means $\bar{Y}_0$ and $\bar{Y}_1$ of characters
$y_0$ and $y_1$ respectively, utilising information on $p$ auxiliary characters
$x_1, x_2, \ldots, x_p$ collected for the first phase sample.

There may arise occasions when some of the auxiliary population means 
are known and others are unknown. In this situation we may estimate the 
unknown auxiliary population means with the known ones and then use the 
double sampling method of estimation for the population mean of the main 
character under study. Alternatively, we may use the double sampling 
procedure of estimation of the main character under study by using auxiliary 
variables with unknown means and the single sampling procedure of estimation 
using the known ones and then combine both these estimators linearly in a 
suitable way. Suppose the problem is to estimate the population mean
$\bar{Y}$ of $y$ adopting an SRSWOR scheme. Further, suppose that supplementary
information on two auxiliary variables $x$ and $z$ correlated with $y$ is
available. The population mean $\bar{X}$ of $x$ is not known, but $\bar{Z}$ of
$z$ is known. Assume that $y$, $x$ and $z$ are positively measured and further
that $\bar{X} > 0$ and $\bar{Z} > 0$.

Suppose on the first phase SRSWOR sample $s'$ of fixed size $n'$ from the
finite population $U$ consisting of $N$ units the characteristics $(x, z)$ are
measured, while an independent sample $s$ of fixed size $n$ is drawn from $U$ to
observe the main character $y$ and $x$. Swain (1970, 1973) suggested an
estimator of the form

$$\hat{\bar{Y}}_{(1)} = \bar{y} + b_{yx} [\bar{x}' + b'_{xz} (\bar{Z} - \bar{z}') - \bar{x}],$$

where $b_{yx}$ and $b'_{xz}$ are the regression coefficients of $y$ on $x$ and of
$x$ on $z$, calculated from the second phase and first phase samples
respectively; $\bar{x}'$ and $\bar{z}'$ are based on $s'$, and $\bar{y}$ and
$\bar{x}$ are based on $s$.

The large sample mean square error of $\hat{\bar{Y}}_{(1)}$ is

$$MSE (\hat{\bar{Y}}_{(1)}) = \Big( \frac{1}{n} - \frac{1}{N} \Big) S_y^2 (1 - \rho_{yx}^2) + \Big( \frac{1}{n'} - \frac{1}{N} \Big) \rho_{yx}^2 S_y^2 (1 - \rho_{xz}^2),$$

where $\rho_{yx}$ and $\rho_{xz}$ are the simple correlation coefficients between
$y$ and $x$ and between $x$ and $z$ respectively. Such an innovation seems to be
of value when $y$ is uncorrelated or poorly correlated with $z$ in spite of high
correlations $\rho_{yx}$ and $\rho_{xz}$.

An alternative estimator is

$$\hat{\bar{Y}}_{(2)} = \bar{y} + \lambda_1 (\bar{x}' - \bar{x}) + \lambda_2 (\bar{Z} - \bar{z}'),$$

where $\lambda_1$ and $\lambda_2$ are predetermined constants or random
variables converging in probability to constants.

If the second phase sample $s$ is a sub-sample of $s'$ on which $y$ is observed,
Kiregyera (1984) suggested a regression-cum-regression estimator which is of
the same form as the one suggested by Swain (1970). A series of alternative
estimators have been suggested by Mukherjee and Rao (1987). The most
efficient one is due to Mishra and Rout (1997), who defined it as

$$\hat{\bar{Y}}_{(3)} = \bar{y} + d_1 (\bar{x}' - \bar{x}) + d_2 (\bar{Z} - \bar{z}) + d_3 (\bar{Z} - \bar{z}'),$$

where $d_1$, $d_2$, $d_3$ are suitably chosen constants, having optimum values

$$d_1 = \beta_{yx.z}, \quad d_2 = \beta_{yz.x} \quad \text{and} \quad d_3 = \beta_{yz} - \beta_{yz.x},$$

where $\beta_{yx.z}$ and $\beta_{yz.x}$ are partial regression coefficients and
$\beta_{yz}$ is the linear regression coefficient of $y$ on $z$. In practice the
regression coefficients are to be replaced by their consistent estimates from
the sample. Alternatively, a weighted estimator may be proposed as

$$\hat{\bar{Y}}_{(4)} = \omega_1 [\bar{y} + b_{yx} \{ \bar{x}' + b_{xz} (\bar{Z} - \bar{z}') - \bar{x} \}] + \omega_2 [\bar{y} + b_{yz} (\bar{Z} - \bar{z})],$$

where $\omega_1 + \omega_2 = 1$.


9.6 Two-phase Sampling for PPS Estimation 

Situations may arise when it is considered important to select a sample of
units with probability proportional to size $x$, information on which is not
readily available but could be collected at moderate cost for a fairly large
sample. Thus, to take advantage of PPS sampling for estimating the population
total $Y$, a large preliminary sample of size $n'$ may be selected with SRSWOR
and the size measure $x$ is observed on each of these units. Next, a second
sample of size $n$ is selected from the first phase sample with pps with
replacement to observe $y$. Alternatively, one may use the first phase sample to
estimate the population total $X$, and an independent sample of size $n$ is
selected with pps to observe $y$ using Lahiri's method (1951), in which it is
assumed that an upper bound for the $x$'s is known in advance although $X$ is
unknown.

For both the above cases Raj (1964) suggested an unbiased estimate of the
population total of $y$,

$$\hat{Y}_{pps-d} = \frac{N}{n'} \frac{X'}{n} \sum_{i=1}^{n} \frac{y_i}{x_i},$$

where $X'$ is the total for $x$ in the first phase sample.
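
The estimator is a product of the first phase estimate of the total of x and the mean of the ratios y/x in the second phase sample. The following sketch computes it; the function name and the toy values are hypothetical.

```python
def double_sampling_pps_total(N, x_first, sub_pairs):
    """Double sampling pps estimate of the population total of y.

    N: population size.
    x_first: size measures x observed on the first phase SRSWOR sample.
    sub_pairs: (x, y) pairs for the n second phase draws made with
        probability proportional to x, with replacement.
    """
    n_first = len(x_first)
    n = len(sub_pairs)
    x_total_first = sum(x_first)                    # X' in the text
    ratio_sum = sum(y / x for x, y in sub_pairs)
    return (N / n_first) * (x_total_first / n) * ratio_sum


# Hypothetical data in which y is exactly twice x on the drawn units.
estimate = double_sampling_pps_total(
    10, [1.0, 2.0, 3.0, 4.0], [(1.0, 2.0), (3.0, 6.0)]
)
```

When y is exactly proportional to x, as in the toy call, the estimate is simply the constant of proportionality times the first phase estimate of the total of x.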


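Case 1 (The second phase sample is a sub-sample of the first phase sample):

$$V(\hat{Y}_{pps-d}) = E_1 V_2 (\hat{Y}_{pps-d}) + V_1 E_2 (\hat{Y}_{pps-d}).$$

Now,

$$E_1 V_2 (\hat{Y}_{pps-d}) = \frac{N}{N - 1} \frac{n' - 1}{n'} \frac{1}{n} \sum_{i=1}^{N} \frac{x_i}{X} \Big( \frac{X y_i}{x_i} - Y \Big)^2 = \frac{N}{N - 1} \frac{n' - 1}{n'} V_{pps} (\hat{Y})$$

and

$$V_1 E_2 (\hat{Y}_{pps-d}) = N^2 \Big( \frac{1}{n'} - \frac{1}{N} \Big) S_y^2.$$

Therefore,

$$V(\hat{Y}_{pps-d}) = \frac{N}{(N - 1)} \frac{(n' - 1)}{n'} V_{pps} (\hat{Y}) + N^2 \Big( \frac{1}{n'} - \frac{1}{N} \Big) S_y^2.$$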

For comparison we write the large sample variances of the two-phase or double
sampling ratio and regression estimators of the population total $Y$ as

$$V(\hat{Y}_{rd}) = \Big( 1 - \frac{n}{n'} \Big) V(\hat{Y}_r) + N^2 \Big( \frac{1}{n'} - \frac{1}{N} \Big) S_y^2$$

and

$$V(\hat{Y}_{reg-d}) = \Big( 1 - \frac{n}{n'} \Big) V(\hat{Y}_{reg}) + N^2 \Big( \frac{1}{n'} - \frac{1}{N} \Big) S_y^2.$$

Thus, for a small sub-sampling fraction, double sampling pps estimation will be
more precise than the double sampling ratio or regression estimator according
as the pps estimator in single phase sampling is better than the ratio or
regression estimator.

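An unbiased estimate of $V(\hat{Y}_{pps-d})$ as obtained by Raj (1964) is

$$\mathrm{Est.}\, V(\hat{Y}_{pps-d}) = \frac{N^2 X'^2}{n'^2 n (n - 1)} \Big[ \sum_{i=1}^{n} \frac{y_i^2}{x_i^2} - \frac{1}{n} \Big( \sum_{i=1}^{n} \frac{y_i}{x_i} \Big)^2 \Big] + \frac{N (N - n')}{n n' (n' - 1)} \Big[ X' \sum_{i=1}^{n} \frac{y_i^2}{x_i} - \frac{X'^2}{n' (n - 1)} \Big\{ \Big( \sum_{i=1}^{n} \frac{y_i}{x_i} \Big)^2 - \sum_{i=1}^{n} \frac{y_i^2}{x_i^2} \Big\} \Big].$$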


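Case 2 (Sub-sample is independent of the first sample):

$$V(\hat{Y}_{pps-d}) = \Big[ 1 + \Big( \frac{1}{n'} - \frac{1}{N} \Big) \frac{N^2 S_x^2}{X^2} \Big] V_{pps} (\hat{Y}) + \Big( \frac{1}{n'} - \frac{1}{N} \Big) N^2 R^2 S_x^2.$$

The large sample variance of the corresponding ratio estimator in two-phase
sampling is

$$V(\hat{Y}_{rd}) = V(\hat{Y}_r) + \Big( \frac{1}{n'} - \frac{1}{N} \Big) N^2 R^2 S_x^2.$$

As the coefficient of $V_{pps} (\hat{Y})$ in the expression for
$V(\hat{Y}_{pps-d})$ is expected to be not far from unity for large $n'$, the
comparison of $\hat{Y}_{pps-d}$ with $\hat{Y}_{rd}$ is similar to the comparison
of $\hat{Y}_{pps}$ with $\hat{Y}_r$.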

Two-phase Sampling with pps Selection and Use of a Second 
Auxiliary Variable 


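Raj (1965b) has suggested a method where the first preliminary sample $s'$ of
size $n'$ is selected with probabilities $p_i$ proportional to $z_i$
($i = 1, \ldots, N$) and with replacement, and the character $x$ is measured on
$s'$. In the second phase a sub-sample $s$ of size $n$ from $s'$ is selected with
equal probabilities without replacement, on which information on $y$ is
collected. The suggested estimator of the population total $Y$ is

$$T_1 = \frac{1}{n} \sum_{i \in s} \frac{y_i}{p_i} - k \Big[ \frac{1}{n} \sum_{i \in s} \frac{x_i}{p_i} - \frac{1}{n'} \sum_{i \in s'} \frac{x_i}{p_i} \Big]$$

with variance

$$V(T_1) = \frac{\sigma_{py}^2}{n} + \Big( \frac{1}{n} - \frac{1}{n'} \Big) [k^2 \sigma_{px}^2 - 2 k \rho \sigma_{py} \sigma_{px}],$$

where

$$\sigma_{py}^2 = \sum_{i=1}^{N} p_i \Big( \frac{y_i}{p_i} - Y \Big)^2,$$

$\sigma_{px}^2$ is defined similarly, $\rho$ = correlation $(y/p, x/p)$ and $k$ is
a scalar. With the optimum value of $k$,

$$V_{opt} (T_1) = \sigma_{py}^2 \Big[ \frac{1}{n} - \rho^2 \Big( \frac{1}{n} - \frac{1}{n'} \Big) \Big].$$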

Srivenkataramana and Tracy (1989) have proposed an alternative scheme
where the first phase sample s' of size n' is selected with probability
proportional to $z_i$, that is, $p_i = z_i/Z$, and information on x is obtained as in Raj's
(1965b) scheme. In the second phase a sub-sample s of n units is selected with
probability proportional to x/z with replacement. It is assumed that x and z take
positive values on each unit in the population. An unbiased estimate under the
alternative scheme is


with 

where 





and 


2 _ 
py 


N 

=Z'« 


1=1 


A-k 

ypi . 


* 


The efficiency of $T_2$ as compared to $T_1$ is

$$E = \frac{V_{opt}(T_1)}{V(T_2)} = \frac{n'(1-\rho^2) + n\rho^2}{n + (n'-1)\theta} = \frac{nr(1-\rho^2) + n\rho^2}{n + (nr-1)\theta} \approx \frac{r(1-\rho^2) + \rho^2}{1 + r\theta},$$

where $r = n'/n$ and $\theta = V_y^{*}/\sigma_{py}^2$.

A modification of Srivenkataramana and Tracy (1989) may be suggested
as follows:

The first phase sample s' of fixed size n' is drawn from the population of N
units in such a way that the first unit of s' is selected with probability
proportional to the size variable z and the remaining (n' - 1) units are selected from the
remaining (N - 1) units of the population with SRSWOR, and x's are observed
on these units ($x_i$, i = 1, ..., n'). Again, in the second phase sample s of fixed
size n, the first unit is selected with probability proportional to x and the remaining (n - 1)
units are selected from the remaining (n' - 1) units of the first phase sample
with SRSWOR to observe y. In fact, in both the phases the Midzuno-Sen scheme of
sampling is implemented. Then,





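$$P(s') = \frac{\bar{z}'}{\bar{Z}} \, \frac{1}{\binom{N}{n'}} \quad \text{and} \quad p(s \mid s') = \frac{\bar{x}}{\bar{x}'} \, \frac{1}{\binom{n'}{n}},$$

where $\bar{z}'$ and $\bar{x}'$ are the first phase sample means of z and x, $\bar{x}$ is the second phase sample mean of x and $\bar{Z}$ is the population mean of z.

A design unbiased estimate of the population total Y may be proposed as

$$T_3 = N \bar{y} \left( \frac{\bar{x}'}{\bar{x}} \right) \left( \frac{\bar{Z}}{\bar{z}'} \right).$$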
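The design unbiasedness claimed above can be checked by brute force on a tiny population. The sketch below assumes the Midzuno-Sen selection probabilities $P(s') = (\bar{z}'/\bar{Z})/\binom{N}{n'}$ and $p(s \mid s') = (\bar{x}/\bar{x}')/\binom{n'}{n}$, and an estimator of the form $N\bar{y}(\bar{x}'/\bar{x})(\bar{Z}/\bar{z}')$; the population values are hypothetical.

```python
from itertools import combinations
from math import comb


def expected_value_of_estimator(y, x, z, n_prime, n):
    """Exact design expectation of N * ybar * (xbar'/xbar) * (Zbar/zbar')
    under Midzuno-Sen selection in both phases (assumed probabilities)."""
    N = len(y)
    Z_bar = sum(z) / N
    expectation = 0.0
    for s1 in combinations(range(N), n_prime):          # first phase samples
        z_bar1 = sum(z[i] for i in s1) / n_prime
        x_bar1 = sum(x[i] for i in s1) / n_prime
        p_s1 = (z_bar1 / Z_bar) / comb(N, n_prime)
        for s2 in combinations(s1, n):                  # second phase samples
            x_bar = sum(x[i] for i in s2) / n
            y_bar = sum(y[i] for i in s2) / n
            p_s2 = (x_bar / x_bar1) / comb(n_prime, n)
            estimate = N * y_bar * (x_bar1 / x_bar) * (Z_bar / z_bar1)
            expectation += p_s1 * p_s2 * estimate
    return expectation


# Hypothetical population of N = 5 units with positive x and z.
y_values = [3.0, 7.0, 2.0, 9.0, 5.0]
x_values = [1.0, 2.0, 1.0, 3.0, 2.0]
z_values = [2.0, 4.0, 1.0, 5.0, 3.0]
print(expected_value_of_estimator(y_values, x_values, z_values, n_prime=3, n=2))
print(sum(y_values))
```

The two printed values should agree up to floating point error, because the ratio factors in the estimator cancel the unequal selection probabilities at each phase.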


9.7 Sampling on Successive Occasions 
9.7.1 Introduction 

Sample surveys need not be confined to a single time period.
When the population characteristics are likely to change rapidly with time, it
may be of interest (i) to estimate the change in Y from one occasion to another,
(ii) to estimate Y for the most recent occasion and finally (iii) to estimate the
average of Y over all occasions. The structure of the sample to meet these
requirements may have the following possibilities:

a) the same sample may be used on each occasion,

b) a new sample may be taken on each occasion,

c) a part of the sample may be replaced by drawing it afresh, which brings
about a further problem of determining the optimal replacement policy.

Suppose the total sample size on each occasion is to be kept the same. Then,
for estimating the change, it is advisable to retain the same
sample throughout all occasions; for estimating the average of Y over all
occasions, it is best to draw a new sample on each occasion; and for
estimating Y on the current occasion it pays to replace part of the sample on each
occasion. Replacement policy was first examined by Jessen (1942), who
considered the problem of sampling on two occasions with partial replacement
of units. Yates (1949), Patterson (1950) and Tikkiwal (1951) have made
detailed studies of successive sampling and have extended Jessen's results to
more than two occasions.

In the case of sampling on two occasions to estimate the current total or mean,
denote the sample on the first (past) occasion as $s_1$ and the sample on the second
occasion as $s_2$, which consists of a matched part $s_{2m}$ selected from $s_1$ and an
unmatched part $s_{2u}$. The unmatched sample $s_{2u}$ may be taken either from U, or
from $U - s_1$, or from $U - s_{2m}$ (which is the same as $U - s_{1m}$, $s_{1m}$ being the matched
part of the first occasion). The sizes of $s_1$ and $s_2$ are usually equal, say n, but may
also be different and further, the sampling designs adopted for $s_1$ and $s_2$ may be
either the same or different, depending on the situation at hand.

As the uniformly minimum variance unbiased estimate is not available in
the finite population context, it is advisable to suggest optimal and practically
reasonable estimators in sampling on repeated occasions for various
circumstances.

9.7.2 Jessen's Estimator (SRSWR on Both Occasions) 

Suppose n units are chosen from the N units of U with SRSWR and the sample size
is kept fixed on both occasions. Out of the n units in $s_1$ on the first occasion, m units are
selected for observation on the second (current) occasion and a
sample of u units is drawn independently for the current occasion, such that u + m
= n. Let y be the character under study, $\sigma^2$ be the variance of y, assumed to be
the same for both the occasions, and $\rho$ be the correlation between the values of y on the two occasions. The estimate of the population mean $\bar{Y}_1$ for the first
occasion is

$$\bar{y}_1 = \frac{1}{n} \sum_{i=1}^{n} y_{1i}.$$

Two different estimates of the population mean $\bar{Y}_2$ can be built for the second
occasion: one from the matched part of m units in the first occasion and the
other from the unmatched part of u units. Thus, from the unmatched part we
have an estimator of $\bar{Y}_2$ as

$$\bar{y}_{2u} = \frac{1}{u} \sum_{i=1}^{u} y_{2i}$$

with variance

$$V(\bar{y}_{2u}) = \frac{\sigma^2}{u} = \frac{1}{W_u} \quad \text{(say)}.$$

As regards the matched part, treating the observations on the first occasion
as ancillary information for the observations on the current occasion, we have a
double sampling regression estimate of $\bar{Y}_2$ as

$$\bar{y}_{2m(reg)} = \bar{y}_{2m} + b \, (\bar{y}_1 - \bar{y}_{1m}),$$

where $\bar{y}_{2m} = \frac{1}{m} \sum_{i=1}^{m} y_{2i}$, $\bar{y}_{1m} = \frac{1}{m} \sum_{i=1}^{m} y_{1i}$ and b = sample regression coefficient
of $y_2$ on $y_1$, with

$$V(\bar{y}_{2m(reg)}) = \frac{\sigma^2 (1-\rho^2)}{m} + \frac{\rho^2 \sigma^2}{n} = \frac{1}{W_m} \quad \text{(say)}.$$

Combining the two independent estimates $\bar{y}_{2u}$ and $\bar{y}_{2m(reg)}$ by weighting them
with the inverse of their variances, we have an estimator due to Jessen (1942, 1978)
given by

$$\bar{y}_{2w} = \frac{W_u \bar{y}_{2u} + W_m \bar{y}_{2m(reg)}}{W_u + W_m}$$

with

$$V(\bar{y}_{2w}) = \frac{1}{W_u + W_m} = \frac{\sigma^2 (n - u\rho^2)}{n^2 - u^2 \rho^2}.$$

The optimum value of u, or the optimum fraction to be replaced Q = (u/n),
is found by minimizing $V(\bar{y}_{2w})$ with respect to u and this gives

$$Q_{opt} = \frac{1}{1 + \sqrt{1-\rho^2}},$$

which gives

$$P_{opt} = \left( \frac{m}{n} \right)_{opt} = \frac{\sqrt{1-\rho^2}}{1 + \sqrt{1-\rho^2}} \quad (\text{since } P + Q = 1).$$

The minimum variance is found by substituting the optimum u in the variance
expression for $V(\bar{y}_{2w})$. Thus,

$$V_{min}(\bar{y}_{2w}) = \frac{\sigma^2}{2n} \left[ 1 + \sqrt{1-\rho^2} \right].$$

The efficiency of the combined estimator $\bar{y}_{2w}$ relative to no matching is

$$RE = \frac{2}{1 + \sqrt{1-\rho^2}}.$$

In the following the optimum matching fraction and percent gain in
efficiency [100(RE - 1)] are computed for various values of $\rho$ (Jessen, 1978).

$\rho$                           0.5    0.6    0.7    0.8    0.9    0.95   1.0
Optimum matching fraction   0.46   0.44   0.42   0.38   0.30   0.24   0
Percent gain in efficiency  7      11     17     25     39     52     100

The extension to multioccasion case has been studied extensively by 
Yates (1949) and Patterson (1950). 
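The tabulated values can be reproduced numerically. The following sketch assumes the optimum matching fraction $\sqrt{1-\rho^2}/(1+\sqrt{1-\rho^2})$ and the relative efficiency $2/(1+\sqrt{1-\rho^2})$; the function name is illustrative only.

```python
from math import sqrt


def jessen_optimum(rho):
    """Return (optimum matching fraction, percent gain in efficiency)
    for sampling on two occasions with correlation rho."""
    root = sqrt(1.0 - rho ** 2)
    matching_fraction = root / (1.0 + root)
    relative_efficiency = 2.0 / (1.0 + root)
    return matching_fraction, 100.0 * (relative_efficiency - 1.0)


for rho in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0):
    fraction, gain = jessen_optimum(rho)
    print(f"rho={rho:<5} matching fraction={fraction:.2f} gain={gain:.0f}%")
```

The gain from partial matching is modest unless the correlation between occasions is high, and the optimum matched fraction never exceeds one half.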

9.7.3 Generalized Linear Estimator 

Another technique to obtain weighted estimates of $\bar{Y}_2$, $\bar{Y}_2 - \bar{Y}_1$ and $\bar{Y}_1 + \bar{Y}_2$, due to
B.J. Tepping (referred to by Hansen, Hurwitz and Madow, 1953) under the
infinite population context, is by writing the generalized linear estimate in
the form

$$\bar{y}_w = a \, \bar{y}_{1u} + b \, \bar{y}_{1m} + c \, \bar{y}_{2u} + d \, \bar{y}_{2m}.$$

In order that $\bar{y}_w$ is an unbiased estimate of $\bar{Y}_2$, we must have a + b = 0 and
c + d = 1, which gives

$$\bar{y}_w = a \, (\bar{y}_{1u} - \bar{y}_{1m}) + c \, \bar{y}_{2u} + (1 - c) \, \bar{y}_{2m}.$$

a and c are determined by minimizing $V(\bar{y}_w)$ with respect to a and c and then,
with the optimum values of a and c, $\bar{y}_w$ reduces to

$$\bar{y}_w = \frac{\rho P Q}{1 - Q^2 \rho^2} \, (\bar{y}_{1u} - \bar{y}_{1m}) + \frac{P}{1 - Q^2 \rho^2} \, \bar{y}_{2m} + \frac{Q (1 - Q\rho^2)}{1 - Q^2 \rho^2} \, \bar{y}_{2u},$$

where P = m/n and Q = u/n. On the assumption that the population variance of the variate y on both the
occasions is the same and is equal to $\sigma^2$,

$$V(\bar{y}_w) = \frac{\sigma^2}{n} \, \frac{1 - Q\rho^2}{1 - Q^2 \rho^2}.$$

The value of Q which minimizes $V(\bar{y}_w)$ for a fixed sample size n is given by

$$Q_{opt} = \frac{1 - \sqrt{1-\rho^2}}{\rho^2} = \frac{1}{1 + \sqrt{1-\rho^2}}.$$

Thus, the optimum variance is

$$V_{min}(\bar{y}_w) = \frac{\sigma^2}{2n} \left[ 1 + \sqrt{1-\rho^2} \right],$$

which is the same as $V_{min}(\bar{y}_{2w})$ by Jessen's method.

If it is desired to estimate the change of means $\bar{Y}_2 - \bar{Y}_1$ between two occasions,
a better weighted unbiased estimate of $\bar{Y}_2 - \bar{Y}_1$ may be derived from the
generalized linear estimate as

$$D_w = \left[ \frac{P}{1 - Q\rho} \right] (\bar{y}_{2m} - \bar{y}_{1m}) + \left[ \frac{Q(1-\rho)}{1 - Q\rho} \right] (\bar{y}_{2u} - \bar{y}_{1u})$$

with

$$V(D_w) = \frac{2(1-\rho)\sigma^2}{n(1 - Q\rho)}.$$

It may be seen that further minimization of $V(D_w)$ gives the optimum value of Q
as 0, that is, the variance of $D_w$ would be least if the same sample is retained on
both occasions and is given by $V_{min}(D_w) = 2\sigma^2(1-\rho)/n$.

To estimate $\bar{Y}_1 + \bar{Y}_2$, the sum of means for the past occasion and the current
occasion, a better unbiased estimate would be

$$Z_w = \left[ \frac{P}{1 + Q\rho} \right] (\bar{y}_{1m} + \bar{y}_{2m}) + \left[ \frac{Q(1+\rho)}{1 + Q\rho} \right] (\bar{y}_{1u} + \bar{y}_{2u}),$$

whose variance is

$$V(Z_w) = \frac{2(1+\rho)\sigma^2}{n(1 + Q\rho)},$$

which is minimum for Q = 1. Thus, $V_{min}(Z_w) = 2\sigma^2/n$.


Note: The optimum estimators discussed above depend on the population
correlation coefficient $\rho$, which may be replaced by some past value, by an
independently estimated value, or by the value of $\rho$ estimated from the sample
itself.

When $\rho$ is estimated from the sample, the estimators no longer remain
unbiased. However, the estimated weights are consistent.
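The three replacement policies can be compared numerically. The sketch below assumes the variance expressions $\frac{\sigma^2}{n}\frac{1-Q\rho^2}{1-Q^2\rho^2}$ for the current mean, $\frac{2(1-\rho)\sigma^2}{n(1-Q\rho)}$ for the change and $\frac{2(1+\rho)\sigma^2}{n(1+Q\rho)}$ for the sum, with $|\rho| < 1$; the grid search helper is hypothetical and only illustrates where each variance is smallest.

```python
from math import sqrt


def var_current_mean(Q, rho, sigma2=1.0, n=1):
    """Variance of the weighted estimate of the current mean."""
    return (sigma2 / n) * (1 - Q * rho ** 2) / (1 - Q ** 2 * rho ** 2)


def var_change(Q, rho, sigma2=1.0, n=1):
    """Variance of the weighted estimate of the change in means."""
    return 2 * (1 - rho) * sigma2 / (n * (1 - Q * rho))


def var_sum(Q, rho, sigma2=1.0, n=1):
    """Variance of the weighted estimate of the sum of means."""
    return 2 * (1 + rho) * sigma2 / (n * (1 + Q * rho))


def best_Q(var_fn, rho, steps=1000):
    """Replacement fraction Q in [0, 1] minimising var_fn on a grid."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda Q: var_fn(Q, rho))


rho = 0.8
print(best_Q(var_current_mean, rho), 1 / (1 + sqrt(1 - rho ** 2)))
print(best_Q(var_change, rho))   # retain the whole sample
print(best_Q(var_sum, rho))      # replace the whole sample
```

For positive $\rho$ the search confirms the qualitative rule stated earlier: full retention for the change, full replacement for the sum (or average), and partial replacement for the current mean.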

9.7.4 SRSWOR on Both Occasions 

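In case the SRSWOR design is adopted on both the occasions,

$$V(\bar{y}_{2u}) = \left( \frac{1}{u} - \frac{1}{N} \right) S_2^2$$

and

$$V(\bar{y}_{2m(reg)}) = \left( \frac{1}{m} - \frac{1}{n} \right) S_2^2 (1 - \rho^2) + \left( \frac{1}{n} - \frac{1}{N} \right) S_2^2, \quad \text{where } S_2^2 = \frac{N}{N-1} \, \sigma_2^2,$$

where $\sigma_2^2$ is the variance of y for the second occasion. Further, noting that

$$Cov(\bar{y}_{2u}, \bar{y}_{2m(reg)}) = -\frac{S_2^2}{N},$$

the variance of the combined weighted estimator $\bar{y}_{2w}$ of $\bar{Y}_2$ is

$$V(\bar{y}_{2w}) = \frac{1 - (u/n)\rho^2}{1 - (u/n)^2 \rho^2} \, \frac{S_2^2}{n} - \frac{S_2^2}{N}.$$

$V(\bar{y}_{2w})$, when minimized with respect to Q = u/n, gives

$$V_{opt}(\bar{y}_{2w}) = \frac{S_2^2}{2n} \left\{ 1 + \sqrt{1-\rho^2} \right\} - \frac{S_2^2}{N},$$

the second term being the result of introducing the finite population correction
factor in the variance formulae of $\bar{y}_{2u}$ and $\bar{y}_{2m(reg)}$.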

Note: When the sample sizes on both the occasions are not the same, the
corresponding formulae for determining the weights, the optimum fraction of the
sample on the first occasion to be replaced and the optimum variance can be
easily derived.

9.7.5 Sampling on More than Two Occasions 

When information on a variate is available for more than two occasions, the 
estimate of population mean on the current occasion can be derived by the help 
of a recurrence relation as given by Yates (1949) and studied in detail by 
Patterson (1950) and Tikkiwal (1951). 

9.7.6 Rotation Sampling - Sampling for Time Series 

The idea of replacement in sampling on repeated occasions has given rise to the
concept of rotation sampling in connection with sampling for a time series,
where a group of units stays in the sample for r occasions, leaves the sample for
m occasions, comes back to the sample for another r occasions, then leaves
the sample for m occasions, and so on. However, one may also visualize other
methods of rotation. Thus, rotation sampling refers to an organized process of
eliminating some of the units from the sample and adding new units to the
sample as time advances. The level of overlap between occasions will depend
on the objectives of the survey as well as on operational constraints, which determine
how long a unit remains in the sample before it is rotated out. Sometimes in
socio-economic studies our interest lies in estimating characteristics on each
occasion, the average over a number of occasions and also time trends,
and a rotation sampling design offers a technique to meet these requirements.
Rotation sampling has been successfully used in the Canadian Labour Force Survey
conducted by Statistics Canada and by the U.S. Bureau of the Census for the Current
Population Survey (Hansen, Hurwitz, Nisselson and Steinberg, 1955).

Hansen, Hurwitz and Madow (1953) suggested alternative estimates for
the estimate of total and change when a time series of monthly and annual sales
is available with a particular pattern of sample rotation. If the estimate of
change is studied with the help of the ratio of unbiased estimates, then the adjusted
chained estimate of total sales for the t-th month is

$$y'_t = \frac{y'_{t-1} \, y_t}{x_{t-1}},$$

where $y_t$ is the estimate of total sales for the current month (t) based on the sample
for the t-th month and $x_{t-1}$ is the estimate of total sales for the (t - 1)th month based on the
sample for the t-th month. Alternatively, a composite estimate for the month t is
formulated as

$$y''_t = W_1 \, \frac{y''_{t-1} \, y_t}{x_{t-1}} + W_2 \, y_t,$$

where $W_1 + W_2 = 1$.
Rao and Graham (1964) have developed a unified finite population theory
for composite estimators of both the current occasion and the change in level
between consecutive occasions when a general rotation sample design is
employed, under the assumption that the exponential and arithmetic correlation
patterns hold over time for the characteristics of interest. They suggest a
composite estimator of the population mean $\bar{Y}_0$ of the current occasion (month)
for a given characteristic, given by

$$\bar{y}'_0 = Q \, (\bar{y}'_{-1} + \bar{y}_{0,-1} - \bar{y}_{-1,0}) + (1 - Q) \, \bar{y}_0,$$

where $\bar{y}_0$ is the estimator based on the entire sample for the current month (0),
$\bar{y}_{0,-1}$ is the estimator for the current month but based on the sample segment
common to both the current and previous months, $\bar{y}_{-1,0}$ is the estimator for the previous
month, but based on the sample segment common to the current and previous months,
and $\bar{y}'_{-1}$ is the composite estimator for the previous month (-1). The composite
estimator of the change $\bar{Y}_0 - \bar{Y}_{-1}$ is $\bar{y}'_0 - \bar{y}'_{-1}$.

As pointed out by Rao and Graham (1964) the composite estimators take 
advantage of the information obtained on previous occasions as well as the 
information from the current occasions and result in smaller variances for both 
the current estimate and the estimate of change. 
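The recursive nature of such a composite estimator is easy to see in code. The sketch below assumes the form $\bar{y}'_t = Q(\bar{y}'_{t-1} + \bar{y}_{t,t-1} - \bar{y}_{t-1,t}) + (1-Q)\bar{y}_t$ and, as a further assumption, starts the recursion with the ordinary full-sample estimate on the first occasion; all input figures are hypothetical.

```python
def composite_estimates(full, common_current, common_previous, Q):
    """Composite estimates for a sequence of occasions.

    full[t]            : estimate for occasion t from the entire sample
    common_current[t]  : estimate for occasion t from the segment common
                         to occasions t and t-1
    common_previous[t] : estimate for occasion t-1 from that same segment
    Index 0 is the starting occasion; the common_* entries at index 0
    are ignored. The recursion is started at full[0] (an assumption).
    """
    composite = [full[0]]
    for t in range(1, len(full)):
        change_from_overlap = common_current[t] - common_previous[t]
        composite.append(Q * (composite[-1] + change_from_overlap)
                         + (1 - Q) * full[t])
    return composite


# Hypothetical monthly estimates.
full = [100, 104, 108]
common_current = [None, 105, 109]
common_previous = [None, 101, 103]
print(composite_estimates(full, common_current, common_previous, Q=0.5))
```

With Q = 0 the composite estimate coincides with the ordinary estimate for each month, while larger Q places more weight on the previous composite estimate updated by the change measured on the overlapping segment.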

In many practical situations the value of the mean at one time period
depends strongly on its preceding value. Blight and Scott (1973), exploring this
concept, extended Patterson's (1950) estimates for the mean of a time dependent
population by assuming that the population mean at time t, say $\theta_t$, follows a linear
Markov process with

$$\theta_t - \mu = \lambda \, (\theta_{t-1} - \mu) + \varepsilon_t \quad (t = 2, 3, \ldots),$$

where $\mu = E(\theta_t)$ and $\{\varepsilon_t : t = 2, 3, \ldots\}$ is an uncorrelated process with zero mean
and constant variance $\sigma^2$. This model takes into account a substantial amount of
variation in a series due to the existence of intra class correlation.

9.7.7 Successive Sampling Using Multi-auxiliary Information 

Sen (1971, 1972, 1973a, b) has developed the theory for successive sampling on
two occasions in the presence of multi-auxiliary information when all the
auxiliary variables are positively correlated with the study variable y; the values
taken by one of the auxiliary variables may be the values of y on the previous
occasion. Adhvaryu (1978), using the two-phase estimator of Rao and Mudholkar
(1967), has extended the theory of successive sampling on two occasions to the case when
some of the auxiliary variables are positively correlated and others are
negatively correlated. Swain (1978) suggested an improved estimate of the
mean on the second occasion by taking account of an improved estimate of
the mean on the first occasion with the help of an auxiliary variable. Chaturvedi and
Tripathi (1983) have considered the use of multiple auxiliary information to
estimate the population ratio R on two occasions.

9.7.8 Successive Sampling with Varying Probabilities 
of Selection 


Raj (1965b) has made use of PPS sampling with replacement to select a sample
$s_1$ of n clusters on the first occasion. Subsequently, on the second occasion a
SRSWOR of $m = n\lambda$ clusters is selected from $s_1$ and an independent sample of
$u = n\mu = n - m$ clusters is selected from the population in the same manner as $s_1$.
To construct a weighted combination of independent estimators from the matched
and unmatched parts, Raj (1965) used the double sampling difference estimator
from the matched part. Thus, the unbiased estimate of the population total on the
second occasion proposed by Raj (1965) is

$$t_2 = \phi \, t_{2u} + (1 - \phi) \, t'_{2m},$$

where

$$t_{2u} = \frac{1}{n-m} \sum_{i \in s_{2u}} \frac{y_{2i}}{p_i}, \quad t_{hm} = \frac{1}{m} \sum_{i \in s_m} \frac{y_{hi}}{p_i} \ (h = 1, 2) \quad \text{and} \quad t_{1n} = \frac{1}{n} \sum_{i \in s_1} \frac{y_{1i}}{p_i},$$

with $p_1, p_2, \ldots, p_N$ representing the selection probabilities and $y_{hi}$ being the
value of the study variable for the i-th unit in the sample on the h (= 1, 2)th occasion,
and further

$$t'_{2m} = (t_{2m} - t_{1m}) + t_{1n}.$$


Now let $s_m$ denote the matched sample and $s_{2u}$ denote the unmatched
sample on the second occasion, represented by $s_{2u} = (s'_{2m}, s'_{2u})$, where $s'_{2m}$
contains $m_2$ units which come from the matched portion and $s'_{2u}$ denotes the
remaining h units, with $m_2 + h = u$. Pathak and T. J. Rao (1967) have suggested
an alternative estimator


t 


♦ 

1 




+ (1- §)t' 2m . 


where $\Sigma'$ denotes the summation over the units in $s'_{2u}$. They have shown that for
any convex loss function $t_2^{*}$ does not have greater expected loss than $t_2$.

Let $s_m^{*}$ denote the set of distinct units in $s_m$. R. Singh (1972) proposed an
alternative estimator as
**

O 




(n - m) 


I' 

f y2i ] 

, a.’y 2 t) 

+ m 7 - 


l Pi J 

Z Pi 


+ (1- (j))/2w 


where $\Sigma'$ denotes the summation over the units in $s'_{2u}$ and $\Sigma''$ denotes
summation over the units in $s_m^{*}$. He also showed that for any convex loss
function $t_2^{**}$ does not have greater expected loss than $t_2^{*}$.

Ghangurde and Rao (1969) have extended Raj's (1965) results to unequal
probability without replacement sampling from a finite population when units
are sampled on two occasions with a time lag. On the first occasion a sample $s_1$
is selected employing the Rao-Hartley-Cochran (RHC) scheme with availability of normed
size measures, and on the second occasion a matched sub-sample $s_m$ is selected
from $s_1$ following SRSWOR and an unmatched sample $s_u$ is drawn using the RHC
scheme from the entire population. Chotai (1974) modified Ghangurde and
Rao's (1969) sampling strategy by selecting the matched sub-sample following the
RHC scheme and investigated their relative efficiencies. Avadhani and
Sukhatme (1972) selected samples $s_1$ and $s_u$ (from $U - s_1$) by SRSWOR and used a
size measure observed in $s_1$ to choose $s_m$ by the RHC scheme. Their result may be
extended to the case when $s_m$ is chosen by some IPPS method. Chaudhuri and
Arnab (1977) considered the estimation of a finite population total on the current
occasion with the help of a successive sampling scheme where on the first
occasion a sample of size n is chosen following Midzuno's (1952) modified
$\pi$PS selection scheme, and on the second occasion a randomly selected portion
m = np of it is retained and a fresh independent sample of size u = nq is chosen
from the universe, again employing Midzuno's modified $\pi$PS selection scheme.
They have compared the efficiency of their strategy with those of Raj (1965),
Ghangurde and Rao (1969) and Chotai (1974).

Further, following the generalized least square approach of Gurney and 
Daly (1965), Chaudhuri and Arnab (1979 b) extended the work of D. Singh 
(1968) and others to obtain sampling strategies which are respectively more 
efficient than the ones suggested by Raj (1965) and Chotai (1974). 

Chaudhuri (1985) has specified an optimal sampling strategy under a 
random permutation model to estimate the current total for a finite population 
with linear non-homogeneous estimators. 

9.7.9 Sampling on Successive Occasions Using 
Multi-stage Design 

Kathuria (1959), D. Singh (1968), D. Singh and Kathuria (1969) have considered
sampling on successive occasions having partial replacement of units and
using a multi-stage design, and have given an illustration with the help of data
obtained from a sample survey conducted by the Indian Council of Agricultural
Research in the Punjab State in 1956-57 to obtain reliable estimates of the milk
yield of cattle and buffaloes.

D. Singh and B.D. Singh (1965) extended the theory of successive 
sampling to double sampling for stratification and further extended it to 
multi-stage sampling with an application to survey conducted for estimation of 
coconut production in the state of Assam (India). Abraham et al. (1969), 
applied the method of successive sampling to estimate incidence of pest and 
diseases on rice crop in Cuttack district of Orissa (India). D. Singh and R. Singh 
(1973) have discussed multi purpose surveys on successive occasions. Kathuria 
(1975) has given some estimators in two stage sampling on successive 
occasions with partial matching at both stages. 

Kathuria (1973), Agarwal and Gupta (1977) and Agarwal (1978) have
investigated the theory of minimum variance estimation within a specific class of
linear unbiased estimators of the population mean on the current occasion using a
two stage sampling design with unequal first stage units, where partial replacement
of units is done at one of the stages only and the SRSWOR scheme is used at both
the stages. Arnab (1980) has proposed a general sampling strategy for
estimating a finite population total for the current occasion based on a two stage
sampling scheme adopted on the current and a previous occasion, generalizing
the sampling schemes of Singh (1968), Abraham, Khosla and Kathuria (1969),
Singh and Kathuria (1969) and Kathuria (1975).

Tikkiwal (1964) has generalized Patterson’s (1950) results for uni-stage 
sampling for the study of one variate on successive occasions to the case when 
each of the units is partially enumerated thus giving rise to two-stages of 
sampling on each of the successive occasions. 

9.7.10 Estimation of Mean Vector on Successive Occasions 

In most large scale surveys it may be desired to study several characters
simultaneously, although some of these characters are of prime importance and
the rest may be relatively less important. Dealing with multiple
characteristics gives rise to several problems at the planning stage. If we
are interested in estimating several means simultaneously, we may face the
problems of optimally allocating the overall sample size to different strata,
choosing a suitable size measure or measures for selecting units with varying
probabilities of selection, and devising suitable estimation procedures.

Tripathi and Khattree (1989) have considered optimum estimation of the
finite population mean vector utilizing information on several auxiliary
variables. Tripathi (1989) has dealt with the problem of estimating the mean
vector $\bar{Y}_{(2)}$ on the second occasion by using the scheme of partial replacement and
multiple regression based on auxiliary information $X = (x_1, \ldots, x_p)'$ obtained by
SRSWOR on a previous occasion.

9.7.11 Estimation of Population Mean on the Second
Occasion for Non-stationary Population

On the first occasion there are $N_1$ units in the population $P_1$. On the second
occasion the population $P_2$ consists of $N_2$ units, including $N_c$ units which
continue from the first occasion, supplemented by $N_b$ units which join
due to births or other reasons. Thus we may write $N_1 = N_c + N_d$ and $N_2 = N_c + N_b$,
where $N_d$ units are left out after the first occasion due to deaths or exit. Let y be
the characteristic of interest, and let $\bar{Y}_1$ and $\bar{Y}_2$ be the population means on the two
occasions. Thus, in the case of a non-stationary population with births and deaths,
we may visualize three domains:

C, the continuing population, the set of units which are eligible
at both occasions, i.e. at time t = 1 and t = 2;

D, the deaths, which are eligible at t = 1, but not at t = 2; and

B, the births, which are eligible at t = 2, but not at t = 1.

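To estimate $\bar{Y}_2$ we choose a simple random sample of n units from the $N_1$ units
on the first occasion. Suppose $n = n_c + n_d$. Compute the sample means $\bar{y}_{1c}$ and $\bar{y}_{1d}$
for the first occasion and $\bar{y}_{2c}$ for the second occasion. If $\bar{Y}_1$ is known, estimate $\bar{Y}_{1c}$ by

$$\hat{\bar{Y}}_{1c} = \frac{N_1 \bar{Y}_1 - N_d \, \bar{y}_{1d}}{N_1 - N_d},$$

with $N_d$ either known or estimated from the sample as

$$\hat{N}_d = \frac{N_1 n_d}{n}.$$

Then,

$$\hat{\bar{Y}}_{2c} = \left( \frac{\bar{y}_{2c}}{\bar{y}_{1c}} \right) \hat{\bar{Y}}_{1c}.$$

$\bar{Y}_{2b}$ is estimated by $\bar{y}_{2b}$, taking a separate sample of $n_b$ units from $N_b$. Now,

$$\hat{\bar{Y}}_2 = \frac{N_b \, \bar{y}_{2b} + (N_1 - N_d) \, \hat{\bar{Y}}_{2c}}{N_1 - N_d + N_b}.$$

This estimator is due to Konijn (1973).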

If Y\ is not known, a sampling design similar to the one discussed in case 
of fixed population may be adopted. For further sampling designs and methods 
of estimation references may be made to Babiker (1984) and Holt and Skinner 
(1989). 
NUMERICAL EXAMPLES


Example 9.1: A simple random sample of 32 fields was selected out of 500
fields in a region for conducting a crop cutting survey for estimating the
average yield per hectare of dry paddy. In each selected field a sample cut of
1/200th of a hectare was taken for harvesting, thus resulting in 32 cuts. For a
sub-sample of 16 cuts out of 32, the dry weight of paddy was also recorded. The
data on yield of paddy in kilograms per hectare were as given in Table 9E.1.


Table 9E.1 Green weight and dry weight of paddy.

Sl. No.   Green wt (kg)   Dry wt (kg)   Sl. No.   Green wt (kg)
          x               y                       x
1         7.6             6.9           17        3.9
2         5.8             5.3           18        5.3
3         8.5             7.9           19        6.5
4         6.3             5.7           20        8.1
5         5.1             4.7           21        3.8
6         4.9             4.6           22        6.6
7         5.7             5.1           23        5.5
8         7.9             7.2           24        3.6
9         6.4             5.9           25        4.0
10        5.4             4.9           26        5.0
11        6.1             5.6           27        5.9
12        3.8             3.4           28        4.8
13        6.2             5.7           29        6.4
14        6.6             6.0           30        5.8
15        7.8             7.3           31        5.4
16        6.4             5.8           32        7.0


Estimate the average yield per hectare of dry paddy and its sampling error 
with and without use of auxiliary information (green wt.). Compute the relative 
efficiency of these estimates. 

Solution : 


N = 500, n' = 32, n = 16

Let $y_i$ and $x_i$ denote the dry weight and green weight of paddy for the i-th cut.

$$\bar{y} = \frac{\sum y_i}{n} = \frac{92}{16} = 5.75$$

$$\bar{x} = \frac{\sum x_i}{n} = \frac{100.5}{16} = 6.2812$$

$$\bar{x}' = \frac{\sum' x_i}{n'} = \frac{188.1}{32} = 5.8781$$

$s_y^2 = 1.324$, $s_x^2 = 1.5047$, $s_{xy} = 1.4083$
$\hat{R} = 0.9154$, $b = 0.936$, $\rho^2 = 0.996$

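Double Sampling Ratio Estimate

$$\hat{\bar{Y}}_{r-d} = \left( \frac{\bar{y}}{\bar{x}} \right) \bar{x}' = \left( \frac{92}{100.5} \right) (5.8781) = 5.3810$$

$$\text{Est. } V(\hat{\bar{Y}}_{r-d}) = \left( \frac{1}{n} - \frac{1}{n'} \right) \left( s_y^2 + \hat{R}^2 s_x^2 - 2 \hat{R} s_{xy} \right) + \left( \frac{1}{n'} - \frac{1}{N} \right) s_y^2$$

= 0.03125 (0.006) + 0.038727 = 0.0389

Double Sampling Regression Estimate

$$\bar{y}_{reg-d} = \bar{y} + b \, (\bar{x}' - \bar{x})$$

= 5.75 + 0.936 (5.8781 - 6.2812) = 5.3727

$$\text{Est. } V(\bar{y}_{reg-d}) = \left( \frac{1}{n} - \frac{1}{n'} \right) s_y^2 (1 - \rho^2) + \left( \frac{1}{n'} - \frac{1}{N} \right) s_y^2$$

= (0.03125) (1.324) (0.004) + 0.038727
= 0.00017 + 0.038727 = 0.0389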



Double sampling ratio and regression estimates are nearly of equal precision 
and their relative efficiency compared to mean per unit estimate is calculated to 
be approximately 206%. 
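The arithmetic of Example 9.1 can be checked with a short script. The sketch below uses the data of Table 9E.1 and assumes the usual double sampling formulae, namely $(\bar{y}/\bar{x})\bar{x}'$ and $\bar{y} + b(\bar{x}' - \bar{x})$ for the estimates, with variance estimates of the form $(1/n - 1/n')(\cdot) + (1/n' - 1/N)s_y^2$; variable names are illustrative, and small differences from the printed figures arise from rounding of intermediate values.

```python
from statistics import mean, variance

N = 500
# Units 1-16: green weight x and dry weight y (second phase sample).
green_matched = [7.6, 5.8, 8.5, 6.3, 5.1, 4.9, 5.7, 7.9,
                 6.4, 5.4, 6.1, 3.8, 6.2, 6.6, 7.8, 6.4]
dry = [6.9, 5.3, 7.9, 5.7, 4.7, 4.6, 5.1, 7.2,
       5.9, 4.9, 5.6, 3.4, 5.7, 6.0, 7.3, 5.8]
# Units 17-32: green weight only.
green_rest = [3.9, 5.3, 6.5, 8.1, 3.8, 6.6, 5.5, 3.6,
              4.0, 5.0, 5.9, 4.8, 6.4, 5.8, 5.4, 7.0]

n = len(dry)
n_prime = n + len(green_rest)
x_bar, y_bar = mean(green_matched), mean(dry)
x_bar_prime = mean(green_matched + green_rest)
s_y2, s_x2 = variance(dry), variance(green_matched)
s_xy = sum((x - x_bar) * (y - y_bar)
           for x, y in zip(green_matched, dry)) / (n - 1)

R = y_bar / x_bar                      # ratio of sample means
b = s_xy / s_x2                        # regression coefficient of y on x
rho2 = s_xy ** 2 / (s_x2 * s_y2)       # squared correlation

ratio_estimate = R * x_bar_prime
regression_estimate = y_bar + b * (x_bar_prime - x_bar)

first_phase_term = (1 / n_prime - 1 / N) * s_y2
v_ratio = (1 / n - 1 / n_prime) * (s_y2 + R ** 2 * s_x2 - 2 * R * s_xy) + first_phase_term
v_regression = (1 / n - 1 / n_prime) * s_y2 * (1 - rho2) + first_phase_term
v_mean_per_unit = (1 / n - 1 / N) * s_y2

print(ratio_estimate, v_ratio)
print(regression_estimate, v_regression)
print(100 * v_mean_per_unit / v_ratio, 100 * v_mean_per_unit / v_regression)
```

Because green weight and dry weight are almost perfectly correlated, nearly all of the remaining variance comes from the first phase term, which is why the ratio and regression estimates are of almost equal precision.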

Example 9.2: A survey was conducted in Saraibal block of Santhal Parganas in
Bihar State (India) during the Rabi crop season of 1967-68. 499 fields growing
wheat were selected at random from 5225 fields and the yield of wheat in
kilograms per acre was recorded by eye estimation. The distribution of the
number of fields falling in each of the six strata is given in Table 9E.2. A
sub-sample of the fields was selected from the fields falling in each of the 6 strata and
crop cutting experiments were conducted. The yield rates observed were
recorded for each of the selected fields, and the calculated sample means and
variances are presented in Table 9E.2.
Table 9E.2

Stratum (kg/acre)   n_h   n'_h   w_h = n'_h/n'   ȳ_h   s_h²
100-200             40    154    0.3086          209   12919
201-300             46    189    0.3788          263   14229
301-400             22    91     0.1824          308   9306
401-500             15    40     0.0802          588   29092
501-600             6     13     0.0260          586   34024
Above 600           2     12     0.0240          528   67712

Obtain an estimate of the average yield of wheat in kg/acre for the block and
also the standard error of the estimate.

Solution: An unbiased estimate of the population mean (average yield of
wheat) is given by

$$\bar{y}_{st-d} = \sum w_h \bar{y}_h = 295.3666 \approx 295 \text{ kg/acre}$$

$$v(\bar{y}_{st-d}) = \frac{N-1}{N} \sum_h \left( \frac{n'_h - 1}{n' - 1} - \frac{n_h - 1}{N - 1} \right) \frac{w_h s_h^2}{n_h} + \frac{N - n'}{N(n' - 1)} \sum_h w_h (\bar{y}_h - \bar{y}_{st-d})^2$$

= 119.6762 + 23.5635 = 143.2397

Standard error $(\bar{y}_{st-d}) = \sqrt{143.2397} = 11.9682 \approx 12$ kg/acre.
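The computation in Example 9.2 can be sketched as follows. The code assumes the variance estimator for double sampling for stratification in the form $\frac{N-1}{N}\sum_h\left(\frac{n'_h-1}{n'-1}-\frac{n_h-1}{N-1}\right)\frac{w_h s_h^2}{n_h} + \frac{N-n'}{N(n'-1)}\sum_h w_h(\bar{y}_h-\bar{y}_{st-d})^2$; this form is an assumption that reproduces the first printed term closely. Because the weights are not rounded to four decimals here, the results differ slightly from the printed figures.

```python
from math import sqrt

N, n_prime = 5225, 499
n_h = [40, 46, 22, 15, 6, 2]                 # second phase sizes
n_prime_h = [154, 189, 91, 40, 13, 12]       # first phase counts per stratum
y_bar_h = [209, 263, 308, 588, 586, 528]     # stratum sample means
s2_h = [12919, 14229, 9306, 29092, 34024, 67712]  # stratum sample variances

w_h = [m / n_prime for m in n_prime_h]

# Point estimate of the mean yield.
y_st = sum(w * y for w, y in zip(w_h, y_bar_h))

# Within-strata component of the variance estimate.
within = (N - 1) / N * sum(
    ((m - 1) / (n_prime - 1) - (k - 1) / (N - 1)) * w * s2 / k
    for m, k, w, s2 in zip(n_prime_h, n_h, w_h, s2_h)
)

# Between-strata component of the variance estimate.
between = (N - n_prime) / (N * (n_prime - 1)) * sum(
    w * (y - y_st) ** 2 for w, y in zip(w_h, y_bar_h)
)

standard_error = sqrt(within + between)
print(y_st, within, between, standard_error)
```

The standard error again comes out at about 12 kg/acre, so the conclusion of the example is not sensitive to the rounding of the weights.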

Example 9.3: The data in Table 9E.3 show the stratification of all 1852 farms in 
a country by farm size and the average acres under corn (maize) per farm in 
each stratum. 


Table 9E.3

Farm size   Number of     Average acres        Standard           W_h = N_h/N
(acres)     farms (N_h)   under corn (Ȳ_h)     deviation (S_h)
0-80        855           11.27                5.4                0.4616
81-160      715           29.06                16.2               0.3861
161-240     282           45.30                25.5               0.1523

Suppose that $C^{*} = 100$ and $C_h = 1 = C$ (say). Further, let C' be the cost per
farm of classifying farms into strata, and $C^{*}$ be the expected cost.

i) Calculate the values of C'/C for which double sampling would bring an
increase in precision.

ii) Determine the optimum double sampling plan if C' = C/100 and calculate
the optimum sampling variance.
Solution:

i) If double sampling is not adopted, a single simple random sample of size n = 100
is to be selected, giving $V(\bar{y}_{srs}) = [(N - n)/Nn] S^2$. With N = 1852, n =
100 and $S^2 = 366.8097 \approx 367$,

$$V(\bar{y}_{srs}) = \frac{(1852 - 100)}{(1852 \times 100)} \times 367 = 3.47.$$

We calculate

$\sum W_h S_h = 12.63$,

$S^2 - \sum W_h S_h^2 = 367 - 214 = 153$.

$$V_{opt}(\bar{y}_{st-d}) = \frac{1}{C^{*}} \left[ \sum W_h S_h \sqrt{C_h} + \left( S^2 - \sum W_h S_h^2 \right)^{1/2} \sqrt{C'} \right]^2 - \frac{S^2}{N}$$

$$= \frac{1}{100} \left[ 12.63 + (153)^{1/2} \sqrt{C'} \right]^2 - \frac{367}{1852}$$

$$= 0.01 \left[ 12.63 + 12.37 \sqrt{C'} \right]^2 - 0.1981.$$

In order that $V_{opt}(\bar{y}_{st-d})$ is less than $V(\bar{y}_{srs}) = 3.47$, we should have C' < 0.28, that
is, (C'/C) < 7/25.

ii) If (C'/C) = 1/100, we have $V_{opt}(\bar{y}_{st-d}) = 1.7248$ and

$$\nu_h = S_h \sqrt{\frac{C'}{C_h \left( S^2 - \sum W_h S_h^2 \right)}} = S_h \, (0.0081).$$

Therefore,

$\nu_1 = 0.0437$, $\nu_2 = 0.1312$, $\nu_3 = 0.2066$.

Since $\sum W_h \nu_h = 0.1023$, we have $C^{*} = C' n' + n' \sum C_h \nu_h W_h$, leading to 100 = n'
(0.1123), which gives n' = 890.5, that is, about 890.
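Part (i) and the optimum variance in part (ii) of Example 9.3 can be verified numerically. The sketch below assumes the optimum variance $\frac{1}{C^*}\left[\sum W_h S_h\sqrt{C_h} + \sqrt{(S^2-\sum W_h S_h^2)\,C'}\right]^2 - S^2/N$ with $C_h = C = 1$; function and variable names are illustrative.

```python
from math import sqrt

N = 1852
C_star = 100.0          # total expected cost
S2 = 367.0              # population variance (rounded, as in the example)
W = [0.4616, 0.3861, 0.1523]
S = [5.4, 16.2, 25.5]

sum_WS = sum(w * s for w, s in zip(W, S))            # about 12.63
within = sum(w * s * s for w, s in zip(W, S))        # about 214
between = S2 - within                                # about 153


def v_opt(c_prime, c=1.0):
    """Optimum variance under double sampling for stratification."""
    return (sum_WS * sqrt(c) + sqrt(between * c_prime)) ** 2 / C_star - S2 / N


# Single phase simple random sample of n = C*/C = 100 farms.
n_srs = 100
v_srs = (N - n_srs) / (N * n_srs) * S2

# Classification cost C' at which double sampling just breaks even.
c_break_even = ((sqrt(C_star * (v_srs + S2 / N)) - sum_WS) / sqrt(between)) ** 2

print(v_srs, c_break_even, v_opt(0.01))
```

Double sampling pays only while the classification cost per farm stays below roughly 28 percent of the cost of observing a farm; at one percent the variance is about half that of simple random sampling.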


EXERCISES 

9.1 Let $R_1$ and $R_2$ be the ratios of population totals for characters y and x on the
first and second occasions respectively.

Define $r_1 = \bar{y}_{1n}/\bar{x}_{1n}$, $r_2 = \bar{y}_{2n}/\bar{x}_{2n}$, where $\bar{y}_{1n}$ ($\bar{x}_{1n}$) and $\bar{y}_{2n}$ ($\bar{x}_{2n}$) denote the
means of y (x) for the first and second occasions based on the same
SRSWOR sample of size n selected on the first occasion. Compare the
estimators

i) $\hat{R}_2^{(1)} = (r_2/r_1) R_1$ (Rao, 1957)

ii) $\hat{R}_2^{(2)} = r_2 r_1 / R_1$ (Rao and Pereira, 1968)

as regards biases and mean square errors by approximate expressions.




9.2 In Exercise 9.1 modify the sampling scheme on the second occasion by
selecting a matched sample of size m from those selected on the first
occasion and an unmatched sample of size u = n - m from units not
selected on the first occasion. Assuming SRSWOR for every stage of
sampling, define the composite estimator as

$$\hat{R}_2^{(3)} = \theta \, \hat{R}_{2m} + (1 - \theta) \, \hat{R}_{2u} \quad \text{(Tripathi and Sinha, 1976)}$$

where

$$\hat{R}_{2m} = \frac{\bar{y}_{2m} + b \, (\bar{y}_{1n} - \bar{y}_{1m})}{\bar{x}_{2m} + b^{*} (\bar{x}_{1n} - \bar{x}_{1m})}, \quad \hat{R}_{2u} = \frac{\bar{y}_{2u}}{\bar{x}_{2u}},$$

and $\bar{y}_{2m}$ ($\bar{x}_{2m}$), $\bar{y}_{1m}$ ($\bar{x}_{1m}$), $\bar{y}_{2u}$ ($\bar{x}_{2u}$) are the means of y (x) for the second and
first occasions based on the matched sample and unmatched sample, b and
$b^{*}$ are the regression coefficients of $y_2$ on $y_1$ and of $x_2$ on $x_1$ respectively,
and the constant $\theta$ is chosen to minimize the mean square error of $\hat{R}_2^{(3)}$.
Compare the efficiency of $\hat{R}_2^{(3)}$ with those of $\hat{R}_2^{(1)}$ and $\hat{R}_2^{(2)}$ suggested in
Exercise 9.1.

9.3 In two stage sampling on two occasions, a sample $S_1$ of n psu's is
selected by PPSWR on the first occasion using $p_i$ (i = 1, ..., N) as the
normed size measure for the i-th unit. If the i-th psu is selected $\lambda_i$ times (i =
1, ..., n), we select $\lambda_i$ independent sub-samples of size $m_i$ from the $M_i$ ssu's
of the i-th psu by SRSWOR. On the second occasion we select a sample of
$\lambda n$ ($0 < \lambda < 1$) psu's from $S_1$ by SRSWOR, but retaining all the ssu's
sub-sampled on the first occasion. A sample $S_2$ of $\mu n$ ($\mu = 1 - \lambda$)
unmatched psu's is selected from the entire population again by PPSWR
and sub-samples by SRSWOR as on the first occasion.

Suggest difference, ratio type and product type estimators to
estimate the ratio of the population totals of two characters on the most
recent occasion (Okafor and Arnab, 1987).

9.4 Suppose a finite population consists of N units and simple random
samples of size n' and n are selected on the first and second occasions
respectively. Further, suppose m of the units in the sample of n' units
selected on the first occasion are retained and a new selection of u units (n =
m + u) is made on the second occasion. Let $x_i$ be the value on the first
occasion on a matched unit for which the value on the second occasion is $y_i$.
Let the population variance $S_y^2$ of y be the same on both occasions.
... estimating the population mean $\bar{Y}_2$ on the second occasion ...
matched and hence the optimum ...
9.5 Consider the following sampling patterns in a two stage sampling on
successive occasions:

i) A fraction p of the primary sampling units with their samples of second
stage units is retained from the previous occasion to the current occasion
and a fraction q of the psu’s is selected afresh (p + q = 1).

ii) All the psu’s in the sample on the previous occasion are retained on the
current occasion, but only a fraction p of the sample of ssu’s within each
psu is retained and a fraction q of the ssu’s is selected afresh.

Compare the relative efficiency of the sampling patterns to estimate the
mean on the current occasion and the difference between means on two
occasions (Kathuria and Singh, 1971).

9.6 Consider a three phase sampling where the first phase sample is used to
determine strata, the second phase sample is used to estimate the auxiliary
population means in strata and on the third phase both y and x are observed.
Suggest a three phase stratified ratio estimate of the population mean of y
and derive its approximate sampling variance. With a suitable cost
function derive conditions under which a three phase stratified ratio
estimate is more efficient than the two phase estimates (Swain, 1973).

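9.7 Consider an SRSWR sample of size n' selected out of N to observe the variate x,
while a sub-sample of size n is selected out of n' to observe the variate y. To
estimate the population mean $\bar{Y}$, consider the following estimators

$$\hat{\bar{Y}}_1 = \left(\frac{\bar{y}_n}{\bar{x}_n}\right)\bar{x}_{n'}, \qquad \hat{\bar{Y}}_2 = \bar{r}_n\,\bar{x}_{n'}, \qquad \hat{\bar{Y}}_3 = \bar{r}_n\,\bar{x}_{n'} + \frac{n(n' - 1)}{n'(n - 1)}\,(\bar{y}_n - \bar{r}_n\,\bar{x}_n),$$

where

$$\bar{r}_n = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{x_i}.$$

Compare $\hat{\bar{Y}}_1$, $\hat{\bar{Y}}_2$ and $\hat{\bar{Y}}_3$ as regards their biases and mean square errors
(Sukhatme, 1962).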

9.8 Suppose the initial sample of n' out of N units in the population is selected using
SRSWOR and the size measure x is observed. The second sample of size n is
selected out of n' with PPSWR and the main character y is observed. An
estimate of the population total Y of y is given by

$$\hat{Y} = \frac{N}{n'}\,\frac{x'}{n}\sum_{i=1}^{n}\frac{y_i}{x_i},$$

where x' is the total of the variate x in the initial sample. Prove that $\hat{Y}$ is an
unbiased estimate of Y and its variance is given by

$$V(\hat{Y}) = \frac{N(n' - 1)}{(N - 1)\,n'}\,V_{pps} + N^2\left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2.$$


If the first sample is solely used to estimate X and an independent
sample of size n is selected with pps using Lahiri’s method in which
an upper bound for the x’s is known, although X is unknown, show that

$$V(\hat{Y}) = \left[1 + \left(\frac{1}{n'} - \frac{1}{N}\right)\frac{S_x^2}{\bar{X}^2}\right]V_{pps} + \left(\frac{1}{n'} - \frac{1}{N}\right)N^2 R^2 S_x^2$$

(Raj, 1964).


9.10 Show that, provided the sub-sampling fractions are small, double 
sampling for pps estimation will be more precise than double 
sampling for ratio or regression estimation according as the pps 
estimate in single-phase is better than the ratio or regression 
estimates. 

9.11 To estimate the population mean $\bar{Y}$ a two phase sample is selected as
follows: In the first phase, a sample of n' primary units is selected out of N
psu’s with equal probability and without replacement. Further, $m'_i$
secondary units are selected from the ith selected primary unit having
$M_i$ secondary units. Thus, $\sum_{i=1}^{n'} m'_i$ secondary units are used to observe
the auxiliary variable x. In the second phase, a sub-sample of n units is
selected at random out of the n' primary units selected in the first phase
and a sub-sample of $m_i$ secondary units is selected out of the $m'_i$ selected
in the first phase to observe y.

Consider an estimator of $\bar{Y}$

$$\hat{\bar{Y}}_R = \frac{\sum_{i=1}^{n} u_i\,\bar{y}_{(m_i)}}{\sum_{i=1}^{n} u_i\,\bar{x}_{(m_i)}}\cdot\frac{1}{n'}\sum_{i=1}^{n'} u_i\,\bar{x}_{(m'_i)},$$

where $u_i = M_i/\bar{M}$; $\bar{M} = \frac{1}{N}\sum M_i$.

Derive the bias and mean square error of $\hat{\bar{Y}}_R$ to the first degree of approximation.
Obtain almost unbiased and unbiased estimators as in the case of the
single stage two phase ratio estimator.

9.12 Suppose we have a universe of $M_0$ secondaries comprising N
primaries of $M_i$ secondaries each. Let $y_{ij}$ be the value of the jth secondary
unit in the ith primary unit. $M_0$ is not known and hence neither is $\bar{M}$. A first phase sample
of n' primary units may be devoted to estimate $\bar{M} = (M_0/N)$, and in the second
phase a sample of n'' psu’s is selected and $m_i$ secondaries are selected
from the ith primary unit in the second phase sample. A ratio estimator of
$\bar{Y}$ may be given by

$$\hat{\bar{Y}} = \frac{\sum_{i=1}^{n''} M_i\,\bar{y}_i}{n''\,\hat{\bar{M}}},$$

where

$$\bar{y}_i = \frac{1}{m_i}\sum_{j=1}^{m_i} y_{ij} \quad \text{and} \quad \hat{\bar{M}} = \frac{1}{n'}\sum_{i=1}^{n'} M_i.$$

Show that $\hat{\bar{Y}}$ is biased. Derive its approximate mean square error. Find
the optimal allocations of n', n'', $m'' = \sum m''_i$ and $m''_i$ with the help of a
linear cost function. Suggest an estimate when a sample n'' is selected in
the second phase by the PPSWR scheme. Derive its mean square error
(Jessen, 1978).

9.13 Let there be N units in the population. A preliminary sample of size n' is
selected without replacement and let $n'_i$ be the number of units in the
preliminary sample belonging to the ith stratum, i = 1, ..., k. To estimate the
population mean, consider the following three sub-sampling procedures:
(i) $n_i$ with replacement, all units used in the estimator; (ii) $n_i$ with
replacement, but only distinct units used; (iii) min($n'_i$, $n_i$) without
replacement. The corresponding estimators with y as the character under
study are

$T_1 = \sum_{i=1}^{k}\dfrac{n'_i}{n'}\,\bar{y}_{n_i}$,

$T_3 = \sum_{i=1}^{k}\dfrac{n'_i}{n'}\,\bar{y}_{\nu_i}$, where $\nu_i$ is the number of distinct units in the sample
from the ith stratum,

$T_2 = \sum_{i=1}^{k}\dfrac{n'_i}{n'}\,\bar{y}_{\nu'_i}$, where $\nu'_i = \min(n'_i, n_i)$.

Show that $T_1$, $T_2$ and $T_3$ are unbiased estimators of $\bar{Y}$ and $V(T_1) > V(T_3) >
V(T_2)$ (D. Singh and B. D. Singh, 1965).

9.14 Construct a two phase ratio estimator of the finite population variance $S_y^2$ of
the study variable y in the presence of a single auxiliary variable x with
unknown population variance $S_x^2$. Derive its approximate bias and mean
square error.

9.15 Construct Tin’s modified two phase ratio estimator by subtracting an
estimate of the first order bias from the estimate. Derive its approximate
bias and mean square error to $O(1/n^2)$.

9.16 Consider a sampling scheme where an initial sample of size m is selected
with SRSWOR and x alone is observed and is classified into different
strata according to values of x. Let $m_i$ be the number of units falling into the
ith stratum. In the second phase $n_i$ units are selected from the $m_i$ units
following the Midzuno scheme of sampling. Suggest an unbiased estimator
of the population total and obtain its variance.

9.17 Show that under certain conditions the two phase ratio and two phase
regression estimates are asymptotically normally distributed.

9.18 Consider an estimate of the ratio Y/X in a two phase sampling where it is 
more expensive to observe y than observing x. Find an approximate 
variance of the estimate and its relative efficiency compared to single 
sampling under a suitable cost function. 



10. ESTIMATION FOR DOMAINS 


10.1 Introduction 

In many sample surveys inferences might be required to be made not only on
the parametric functions associated with the finite population U, but also on
similar parametric functions associated with one or more sub-populations $U_d$
of size $N_d$ (d = 1, ..., D) of U. These sub-populations are different classes into
which the population can be sub-divided and may be classified into three types:

i) $N_d$ is known and the sampling frame for $U_d$ is available for each d = 1, ..., D.

ii) The $N_d$’s are known, but sampling frames for the $U_d$ are not available.

iii) Neither are the $N_d$’s known nor are the sampling frames available for the
sub-populations.

The estimation problems for the sub-populations in case of situation (i)
can be tackled through stratified random sampling, with the sub-populations
treated as strata. The type of sub-populations which we come across in
situations (ii) and (iii) are called ‘Domains’ or ‘Domains of study’ as named by
the UN Sub-Commission on Sampling (1950). Thus, we may define a domain as a
sub-set $U_d$ of U for which sampling frames are not available. Domains of a
population may be either overlapping or non-overlapping. Overlapping
domains may also be expressed as unions of disjoint sets of units.

Domains may be either geographical units or administrative units or units 
having one or more attributes such as ethnicity, caste, sex, religion, mother 
tongue, economic status, employment status, educational level, occupation, 
special characteristics of households such as number of school going children, 
number of employed persons, etc. 

For inferences on domain parameters, information is contained in the
sample $s = \{i_1, \ldots, i_{n(s)}\}$ selected from the population U following a probability
sampling scheme, resulting in either a fixed or a random sample size. The
sample s may be configured as $s = \{s_1, \ldots, s_d, \ldots, s_D\}$, where $s_d$ consists of
those units from the domain $U_d$ (d = 1, ..., D) which appear in s. The sample
size n(s), thus, would have the configuration

$n(s) = \{n_1(s), \ldots, n_d(s), \ldots, n_D(s)\}$,

$n_d(s)$ being the number of units in $s_d$, and $\sum_{d=1}^{D} n_d(s) = n(s)$.

Although the UN Sub-Commission (1949) was the first to point out the need
for study of parameters of domains, it was Yates (1953) who discussed the
problems of estimation of domain means, totals and proportions in some detail
for data based on simple random sampling. Subsequently Durbin ( ),
Hartley (1959) and Kish (1965, 1968) considered the estimation of domain
parameters in a stratified random sampling set-up. Scott and Smith (1971)
examined the use of prior information about domain size $N_d$ in connection with
domain parameters. Durbin (1958) and Tin and Toe (1972) have considered
estimation of domain totals and means in multistage sampling. The use of
inverse sampling for the estimation of proportions in domains was considered
by Tripathi (1973, 1991) and for the estimation of domain means by Chaudhuri
and Mukerjee (1984).


10.2 Estimation for Domains in Simple Random Sampling 


Let the dth domain $U_d$, d = 1, ..., D, consist of $N_d$ units and let $s_d$, the part of the sample s of
size n that falls in the dth domain, consist of $n_d$ units. The important domain
parameters for inference are the domain mean, total, proportion, ratio, variance,
etc. Consider simple random sampling to select a sample of fixed size n from
the finite population of size N. Although n is fixed, $n_d$ will vary from one sample
to another, thus making it a random variable. Two types of argument follow:

i) In repeated sampling $n_d$ is assumed to be fixed.

ii) $n_d$ is assumed to be a random variable with $\sum_{k \geq 1} P(n_d = k) = 1$, assuming
$P(n_d = 0)$ to be very small.

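a) Estimation of domain mean ($\bar{Y}_d$)

Let $y_{dk}$ be the sample value corresponding to the kth unit of the dth domain. Then
define

$$\bar{y}_d = \frac{1}{n_d}\sum_{k=1}^{n_d} y_{dk}.$$

Under both (i) and (ii) $\bar{y}_d$ is unbiased for $\bar{Y}_d$.

The conditional variance of $\bar{y}_d$ for fixed $n_d$ is

$$V(\bar{y}_d \mid n_d) = \left(\frac{N_d - n_d}{N_d}\right)\frac{S_d^2}{n_d}, \quad \text{where} \quad S_d^2 = \sum_{k=1}^{N_d}\frac{(Y_k - \bar{Y}_d)^2}{N_d - 1}.$$

Its estimate is

$$v(\bar{y}_d \mid n_d) = \left(\frac{N_d - n_d}{N_d}\right)\frac{s_d^2}{n_d}, \quad \text{where} \quad s_d^2 = \sum_{k=1}^{n_d}\frac{(y_{dk} - \bar{y}_d)^2}{n_d - 1}.$$

If $n_d$ is regarded as random,

$$V(\bar{y}_d) = E[V(\bar{y}_d \mid n_d)] = \left[E\left(\frac{1}{n_d}\right) - \frac{1}{N_d}\right]S_d^2 \approx \left[\frac{1}{E(n_d)} - \frac{1}{N_d}\right]S_d^2,$$

where $E(n_d) = nN_d/N$.

If $N_d$ is not known, n/N may be used in place of $n_d/N_d$ because in simple
random sampling $E(n_d/N_d) = n/N$, and thus $v(\bar{y}_d \mid n_d) = [(N - n)/N]\,s_d^2/n_d$.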
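The domain mean and its conditional variance estimate can be sketched in a few lines of Python. The function name `domain_mean_srs` and the data are hypothetical; the finite population correction is $(N_d - n_d)/N_d$ when $N_d$ is supplied and $(N - n)/N$ otherwise, as described above.

```python
def domain_mean_srs(y_d, N, n, N_d=None):
    """Domain mean and conditional variance estimate under SRSWOR.

    y_d  : sample values falling in the domain (at least two are needed)
    N, n : population and sample sizes
    N_d  : domain size if known, otherwise None
    """
    n_d = len(y_d)
    if n_d < 2:
        raise ValueError("need at least two domain units to estimate s_d^2")
    ybar_d = sum(y_d) / n_d
    s2_d = sum((y - ybar_d) ** 2 for y in y_d) / (n_d - 1)
    # (N_d - n_d)/N_d when N_d is known, else the overall (N - n)/N
    fpc = (N_d - n_d) / N_d if N_d is not None else (N - n) / N
    return ybar_d, fpc * s2_d / n_d

# hypothetical data: 5 of the n = 20 sampled units fall in the domain
y_d = [4.0, 6.0, 5.0, 7.0, 8.0]
mean_known, var_known = domain_mean_srs(y_d, N=100, n=20, N_d=30)
mean_unknown, var_unknown = domain_mean_srs(y_d, N=100, n=20)
print(mean_known, var_known, var_unknown)
```

Both calls return the same mean; only the finite population correction in the variance estimate differs.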

b) Estimation of Domain Total ($Y_d$)

If $N_d$ is known, the sample estimate $T_{1d} = N_d\,\bar{y}_d$ is conditionally unbiased for $Y_d$
and $V(T_{1d} \mid n_d) = N_d^2\,V(\bar{y}_d \mid n_d)$.

If $N_d$ is known in advance and $n_d$ is regarded as random, $T_{1d}$ still remains
unconditionally unbiased and

$$V(T_{1d}) = V[E(T_{1d} \mid n_d)] + E[V(T_{1d} \mid n_d)] = E[V(T_{1d} \mid n_d)] = N_d^2\,E\left(\frac{1}{n_d} - \frac{1}{N_d}\right)S_d^2 \approx \frac{N^2(1 - f)}{n}\,W_d\,S_d^2,$$

where $W_d = N_d/N$ and $f = n/N$.


If $N_d$ is not known, but the population total Y is known, we use the ratio
estimator

$$T_{2d} = Y\,\frac{\sum_{k \in s_d} y_k}{\sum_{k \in s} y_k}.$$

Now, define a new variate $Y'_k = \delta_{dk} Y_k$, where $\delta_{dk} = 1$ if the kth unit belongs to
the dth domain and zero otherwise. Write the population total of the $Y_k$’s in the dth
domain as

$$Y_d = \sum_{k \in U_d} Y_k = \sum_{k \in U} Y'_k = N\bar{Y}'.$$

In an SRS of size n,

$$\sum_{k \in s_d} y_k = \sum_{k \in s} y'_k = n\bar{y}', \qquad T_{2d} = Y\left(\sum_{k \in s} y'_k \Big/ \sum_{k \in s} y_k\right) = Y\left(\frac{\bar{y}'}{\bar{y}}\right).$$

Thus, $T_{2d}$ is a ratio estimator.

$T_{2d}$ is approximately unbiased in large samples and its approximate
variance may be estimated by

$$v(T_{2d}) = N^2\,\frac{(N - n)}{Nn}\,s_d'^2, \quad \text{where} \quad (n - 1)\,s_d'^2 = \sum_{k \in s}(y'_k - \hat{R}\,y_k)^2,$$

$\hat{R}$ being equal to $\bar{y}'/\bar{y}$.

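If neither $N_d$ nor Y is known, an estimate of the total $Y_d$ is given by

$$T_{3d} = \left(\frac{N}{n}\right)n_d\,\bar{y}_d = \frac{N}{n}\sum_{k \in s} y'_k = N\bar{y}'.$$

Considering repeated samples of size n, it may be seen that $T_{3d}$ is an
unbiased estimate of $Y_d$ with variance

$$
\begin{aligned}
V(T_{3d}) &= \frac{N^2(1 - f)}{n}\sum_{k \in U}\frac{(Y'_k - \bar{Y}')^2}{N - 1} \\
&= \frac{N^2(1 - f)}{n(N - 1)}\left[\sum_{k \in U_d}(Y_k - W_d\bar{Y}_d)^2 + \sum_{k \notin U_d}(0 - W_d\bar{Y}_d)^2\right] \\
&= \frac{N^2(1 - f)}{n(N - 1)}\left[\sum_{k \in U_d}(Y_k - W_d\bar{Y}_d)^2 + (N - N_d)\,W_d^2\,\bar{Y}_d^2\right] \\
&= \frac{N^2(1 - f)}{n(N - 1)}\left[\sum_{k \in U_d}(Y_k - \bar{Y}_d)^2 + N_d\bar{Y}_d^2(1 - W_d)^2 + (N - N_d)\bar{Y}_d^2 W_d^2\right] \\
&\approx \frac{N^2(1 - f)}{n}\left[W_d S_d^2 + \bar{Y}_d^2\,W_d(1 - W_d)\right] \\
&= \frac{(1 - f)N_d^2}{nW_d}\left[S_d^2 + \bar{Y}_d^2(1 - W_d)\right].
\end{aligned}
$$

It may be remarked here that $T_{1d}$ is more efficient than $T_{3d}$ in large
samples. The estimates of $V(T_{1d})$ and $V(T_{3d})$ may be obtained by substituting $s_d^2$
in place of $S_d^2$ and $N(n_d/n)$ in place of $N_d$.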
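The three estimators of the domain total under simple random sampling differ only in what is assumed known. A minimal sketch, assuming the forms $T_{1d} = N_d\bar{y}_d$ ($N_d$ known), $T_{2d} = Y\sum_{s_d} y_k/\sum_s y_k$ (Y known) and $T_{3d} = (N/n)\sum_{s_d} y_k$ (neither known), is given below; the function name and the data are hypothetical.

```python
def domain_totals_srs(y, in_domain, N, N_d=None, Y=None):
    """Return the available estimates of the domain total Y_d from an SRS.

    y         : sample values y_k, k in s
    in_domain : booleans, True when unit k belongs to the domain
    N_d, Y    : known domain size / known population total, if available
    """
    n = len(y)
    # y'_k equals y_k for domain units and zero otherwise
    y_prime = [yk if flag else 0.0 for yk, flag in zip(y, in_domain)]
    n_d = sum(in_domain)
    out = {"T3d": N * sum(y_prime) / n}            # neither N_d nor Y known
    if N_d is not None and n_d > 0:
        out["T1d"] = N_d * sum(y_prime) / n_d      # N_d known
    if Y is not None:
        out["T2d"] = Y * sum(y_prime) / sum(y)     # Y known (ratio estimator)
    return out

# hypothetical sample of n = 5 units, two of them in the domain
y = [2.0, 4.0, 6.0, 8.0, 10.0]
flags = [True, False, True, False, False]
totals = domain_totals_srs(y, flags, N=50, N_d=15, Y=300.0)
print(totals)
```

Writing all three through the zero-padded variate $y'_k$ makes it plain that $T_{2d}$ and $T_{3d}$ are ordinary whole-sample estimators applied to $y'_k$.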

Generalized Estimator of Population Total

Suppose a sample s of size n is chosen from the finite population U according to
design p(s) admitting $P(k \in s) = \pi_k > 0$ and $P(k, l \in s) = \pi_{kl} > 0$. Two estimators
of the population total $Y_d$ for the dth domain may be proposed as

$$\sum_{k \in s_d}\frac{y_k}{\pi_k}$$

and

$$N_d\left(\sum_{k \in s_d}\frac{y_k}{\pi_k}\right)\Big/\left(\sum_{k \in s_d}\frac{1}{\pi_k}\right).$$
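A short Python sketch of the two design-weighted estimators follows. It assumes the first is the Horvitz-Thompson type sum of $y_k/\pi_k$ over $s_d$ and the second is $N_d$ times the $\pi$-weighted domain mean (which requires $N_d$ to be known); the function names and data are hypothetical.

```python
def ht_domain_total(y_sd, pi_sd):
    """Sum of y_k / pi_k over the sampled domain units."""
    return sum(y / p for y, p in zip(y_sd, pi_sd))

def ratio_domain_total(y_sd, pi_sd, N_d):
    """N_d times the pi-weighted domain mean (N_d assumed known)."""
    n_hat = sum(1.0 / p for p in pi_sd)      # design-based estimate of N_d
    return N_d * ht_domain_total(y_sd, pi_sd) / n_hat

# hypothetical domain sample with unequal inclusion probabilities
y_sd = [10.0, 20.0, 30.0]
pi_sd = [0.1, 0.2, 0.5]
t_ht = ht_domain_total(y_sd, pi_sd)
t_ratio = ratio_domain_total(y_sd, pi_sd, N_d=20)
print(t_ht, t_ratio)
```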



10.3 Estimation for Domains in Stratified Random Sampling 

Sometimes it may be desired to consider domains of study in all strata in a 
stratified random sampling. For instance, when strata represent geographical or 
administrative units, estimates may be wanted for the whole population as well 
as for the males and females, social groups, sections of populations having 
different characteristics, etc. The basic formulae in this context are derived by 
Durbin (1958) and Hartley (1959). 

Let the population be sub-divided into L strata, each having $N_h$ units (h =
1, ..., L), of which $N_{hd}$ units fall into the dth domain. We have $N_h = \sum_d N_{hd}$ and $n_h = \sum_d n_{hd}$.
Define $y_{hdk}$ as the measurement on the kth unit of the dth domain in the hth stratum. The
sample mean and population mean corresponding to the dth domain of the hth stratum
are defined as

$$\bar{y}_{hd} = \sum_{k=1}^{n_{hd}}\frac{y_{hdk}}{n_{hd}} \quad \text{and} \quad \bar{Y}_{hd} = \sum_{k=1}^{N_{hd}}\frac{Y_{hdk}}{N_{hd}} \quad \text{respectively.}$$

The population total and mean for the domain d over all strata are respectively

$$Y_d = \sum_h N_{hd}\,\bar{Y}_{hd} \quad \text{and} \quad \bar{Y}_d = \frac{Y_d}{N_d}, \quad \text{where } N_d = \sum_h N_{hd}.$$

Known $N_{hd}$

If the $N_{hd}$ are known for all h, the estimates of $Y_d$ and $\bar{Y}_d$ may be calculated as

$$\hat{Y}_d = \sum_h N_{hd}\,\bar{y}_{hd} \quad \text{and} \quad \hat{\bar{Y}}_d = \frac{\hat{Y}_d}{N_d},$$

where the $n_{hd}$ are random variables.

The formula for $V(\bar{y}_{hd})$ remains valid as in Section 10.2 provided $n_{hd} > 0$.
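Unknown $N_{hd}$

When the $N_{hd}$ are unknown, $Y_d$ is estimated by

$$\hat{Y}_d = \sum_h \frac{N_h}{n_h}\sum_{k=1}^{n_{hd}} y_{hdk}.$$

Its variance and the estimated value of the variance are found as in Section 10.2, introducing a
variate $y'_{hk} = \delta_{dhk}\,y_{hk}$, where $\delta_{dhk} = 1$ if the kth unit of the hth stratum belongs to the
dth domain and zero otherwise.

The domain mean $\bar{Y}_d = Y_d/N_d$ is estimated by $\hat{\bar{Y}}_d = \hat{Y}_d/\hat{N}_d$, where $\hat{N}_d$ is an
unbiased estimate of $N_d$ and is given by

$$\hat{N}_d = \sum_h \frac{N_h}{n_h}\,n_{hd}.$$

Thus,

$$\hat{\bar{Y}}_d = \sum_h \frac{N_h}{n_h}\sum_{k=1}^{n_{hd}} y_{hdk} \Big/ \sum_h \frac{N_h}{n_h}\,n_{hd},$$

which is asymptotically unbiased for $\bar{Y}_d$. Expressing $\hat{\bar{Y}}_d$ as

$$\hat{\bar{Y}}_d = \sum_h \frac{N_h}{n_h}\left(\sum_{k=1}^{n_h} y'_{hk}\right) \Big/ \sum_h \frac{N_h}{n_h}\left(\sum_{k=1}^{n_h}\delta_{dhk}\right),$$

we find that the sampling variance of $\hat{\bar{Y}}_d$ and its estimate are given by

$$V(\hat{\bar{Y}}_d) \approx \frac{1}{N_d^2}\sum_h \frac{N_h^2(1 - f_h)}{n_h(N_h - 1)}\left[\sum_{k=1}^{N_{hd}}(Y_{hdk} - \bar{Y}_{hd})^2 + N_{hd}\left(1 - \frac{N_{hd}}{N_h}\right)(\bar{Y}_{hd} - \bar{Y}_d)^2\right]$$

and

$$v(\hat{\bar{Y}}_d) = \frac{1}{\hat{N}_d^2}\sum_h \frac{N_h^2(1 - f_h)}{n_h(n_h - 1)}\left[\sum_{k=1}^{n_{hd}}(y_{hdk} - \bar{y}_{hd})^2 + n_{hd}\left(1 - \frac{n_{hd}}{n_h}\right)(\bar{y}_{hd} - \hat{\bar{Y}}_d)^2\right],$$

where $f_h = n_h/N_h$.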
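When the $N_{hd}$ are unknown, the estimates of the domain total, the domain size and the domain mean are all built from stratum expansion factors $N_h/n_h$. The sketch below assumes the forms $\hat{Y}_d = \sum_h (N_h/n_h)\sum_k y_{hdk}$ and $\hat{N}_d = \sum_h (N_h/n_h)\,n_{hd}$; the function name, dictionary keys and data are hypothetical.

```python
def stratified_domain_estimates(strata):
    """strata: list of dicts with keys
       'N_h' (stratum size), 'n_h' (stratum sample size) and
       'y_d' (values of sampled units of the stratum that fall in the domain)."""
    y_hat = sum(s["N_h"] / s["n_h"] * sum(s["y_d"]) for s in strata)   # estimates Y_d
    n_hat = sum(s["N_h"] / s["n_h"] * len(s["y_d"]) for s in strata)   # estimates N_d
    return y_hat, n_hat, y_hat / n_hat                                  # total, size, mean

# hypothetical two-stratum sample
strata = [
    {"N_h": 100, "n_h": 10, "y_d": [3.0, 5.0]},
    {"N_h": 200, "n_h": 10, "y_d": [4.0, 6.0, 8.0]},
]
total_hat, size_hat, mean_hat = stratified_domain_estimates(strata)
print(total_hat, size_hat, mean_hat)
```

The domain mean is a ratio of two random quantities here, which is why it is only asymptotically unbiased.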

Durbin (1958), following the aforesaid technique, has derived expressions
to estimate the mean for the entire population, if the sample is incomplete
due to non-response and also has extended the method to multi-stage sampling.

Kish (1965, 1968) has suggested formulas for comparison of sub-class
means or domain means in stratified random sampling under proportional and
optimum allocations and extended the discussion to sub-classes and
comparisons in cluster sampling. Sedransk (1966a, b; 1967) has made
extensive investigations of the problem of optimum allocations for analytical
studies of the difference of means from survey samples.

10.4 Regression Estimators for Small Domains 

When domains are large in number, one would expect that the sample data in
some domains may be small in number, even if the domains are non-empty. The
small domain estimators may be improved with the help of auxiliary
information besides the knowledge of $N_d$. Information on these auxiliary
variables may be made available from registers and other available sources.
Sarndal’s (1984) generalized regression approach involves the fit of a linear
regression model ($\xi$) (with y as the variable under study and x, a p-vector of
auxiliary variables) of the form $E_\xi(y_k) = x'_k\beta$ and $V_\xi(y_k) = \sigma_k^2$, to the observed $y_k$’s
for $k \in s$ under a sampling design with inclusion probabilities $\pi_k$. The estimated
regression coefficient vector is

$$\hat{\beta} = \left(\sum_{k \in s}\frac{x_k x'_k}{\sigma_k^2\,\pi_k}\right)^{-1}\sum_{k \in s}\frac{x_k y_k}{\sigma_k^2\,\pi_k}.$$

The resulting predicted values are $\hat{y}_k = x'_k\hat{\beta}$. The corresponding
residuals are $e_k = y_k - \hat{y}_k = y_k - x'_k\hat{\beta}$.

Sarndal (1981, 1984) and Hidiroglou and Sarndal (1985), borrowing
strength from the data outside the domain, proposed two approximately design
unbiased estimators of $Y_d$ as

i) $\hat{Y}_d = \sum_{k \in U_d}\hat{y}_k + \sum_{k \in s_d}\dfrac{e_k}{\pi_k}$, if $N_d$ is unknown.

ii) $\hat{Y}'_d = \sum_{k \in U_d}\hat{y}_k + \dfrac{N_d}{\hat{N}_d}\sum_{k \in s_d}\dfrac{e_k}{\pi_k}$, where $\hat{N}_d = \sum_{k \in s_d}\dfrac{1}{\pi_k}$ and $N_d$ is known.

The first term $\sum_{U_d}\hat{y}_k$ is called the synthetic term and $\sum_{s_d} e_k/\pi_k$ may be called a
correction term. In forming the domain estimator $\hat{Y}_d$ auxiliary information is
judiciously used and the estimator is design unbiased regardless of whether the
model is true or not. $\hat{Y}'_d$ ordinarily has smaller variance compared to $\hat{Y}_d$.

In samples in which $n_d$ is extremely small, Sarndal and Hidiroglou (1989)
pointed out that the variance of the correction term in both $\hat{Y}_d$ and $\hat{Y}'_d$ can be
large, leading sometimes to estimates outside the permissible values. To avoid
such an uncomfortable situation, Sarndal and Hidiroglou (1989) have introduced
a dampening factor to the correction term in $\hat{Y}'_d$ when $\hat{N}_d < N_d$, and their
dampened regression estimator for $Y_d$ is

$$\hat{Y}^*_d = \sum_{k \in U_d}\hat{y}_k + \left(\frac{\hat{N}_d}{N_d}\right)^{H-1}\sum_{k \in s_d}\frac{e_k}{\pi_k},$$

with H = 0 if $\hat{N}_d \geq N_d$ and H = h if $\hat{N}_d < N_d$, where h is a suitably chosen positive
constant and may preferably be equal to 2 as a general purpose value.
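The regression estimators of this section can be illustrated for the simplest case of a single auxiliary variable with a regression line through the origin. The sketch below is written under that simplifying assumption (so $\hat{\beta}$ is a scalar and the synthetic term equals $\hat{\beta}X_d$, with $X_d$ the known domain total of x); the function name, argument layout and data are hypothetical, and the dampening exponent follows the form $(\hat{N}_d/N_d)^{H-1}$.

```python
def greg_domain_total(sample, X_d, N_d=None, h=None):
    """Regression estimate of a domain total with one auxiliary variable.

    sample : list of (y, x, pi, sigma2, in_domain) for every k in s
    X_d    : known domain total of x (gives the synthetic term beta * X_d)
    N_d    : known domain size, or None for estimator (i)
    h      : dampening constant, or None for the undampened estimator (ii)
    """
    # weighted least squares slope through the origin, using the whole sample
    num = sum(x * y / (s2 * pi) for y, x, pi, s2, _ in sample)
    den = sum(x * x / (s2 * pi) for y, x, pi, s2, _ in sample)
    beta = num / den
    synthetic = beta * X_d
    domain = [(y, x, pi) for y, x, pi, _, d in sample if d]
    correction = sum((y - beta * x) / pi for y, x, pi in domain)
    if N_d is None or not domain:
        factor = 1.0                      # estimator (i), or no domain units at all
    else:
        N_hat = sum(1.0 / pi for _, _, pi in domain)
        H = h if (h is not None and N_hat < N_d) else 0
        factor = (N_hat / N_d) ** (H - 1)
    return synthetic + factor * correction

# hypothetical sample: (y, x, pi, sigma2, in_domain)
s = [(2.0, 1.0, 0.5, 1.0, True), (4.0, 2.0, 0.5, 1.0, True),
     (7.0, 3.0, 0.5, 1.0, False), (8.0, 4.0, 0.5, 1.0, False)]
est_i = greg_domain_total(s, X_d=10.0)
est_ii = greg_domain_total(s, X_d=10.0, N_d=5)
est_damped = greg_domain_total(s, X_d=10.0, N_d=5, h=2)
print(est_i, est_ii, est_damped)
```

In this illustration $\hat{N}_d = 4 < N_d = 5$, so the dampened version shrinks the correction term while estimator (ii) inflates it.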


10.5 Small Domain Estimation When Population is 
Divided into Groups 


Suppose the population U = (1, ..., k, ..., N) is divided into D non-overlapping
domains $U_1, \ldots, U_d, \ldots, U_D$. Let $N_d$ be the size of $U_d$. Further, suppose the
population is again classified into G non-overlapping sub-groups ($U_{.1}, \ldots, U_{.g}, \ldots, U_{.G}$).
Groups are formed in such a way that internal homogeneity, as far as
possible, is maintained. The groups are usually small in number compared to the
number of domains and may also be treated as post-strata. The cross
classification leads to DG cells $U_{dg}$ (d = 1, ..., D; g = 1, ..., G). Let $N_{dg}$ be the
known size of $U_{dg}$, so that $N = \sum_d N_d = \sum_d\sum_g N_{dg}$. Let s denote the sample of size n to be
selected from U under the sampling design p(s) with $\pi_k > 0$ for all k and $\pi_{kl} > 0$
for all $k \neq l$. Let $s_d$ and $s_{dg}$ be the parts of the sample falling into $U_d$ and $U_{dg}$.

To estimate $Y_d$, we have the following direct estimators.

i) Horvitz-Thompson estimator

$$\hat{t}_{d(HT)} = \sum_{k \in s_d}\frac{y_k}{\pi_k}$$

ii) Post-stratified estimator based on groups

$$\hat{t}_{d(PostG)} = \sum_g N_{dg}\,\tilde{y}_{s_{dg}},$$

where $N_{dg}$ is known and $\tilde{y}_{s_{dg}} = \left(\sum_{k \in s_{dg}}\dfrac{y_k}{\pi_k}\right)\Big/\left(\sum_{k \in s_{dg}}\dfrac{1}{\pi_k}\right)$.

iii) Post-stratified estimator based on groups using auxiliary information

Smallness of the size of $s_{dg}$ may be countered favourably by forming a
post-stratified estimator using G group ratios, given the availability of auxiliary
data on x correlated with the variable y under study:

$$\hat{t}_{d(PostG/R)} = \sum_g X_{dg}\left(\frac{\tilde{y}_{s_{dg}}}{\tilde{x}_{s_{dg}}}\right),$$

where $\tilde{x}_{s_{dg}}$ is defined analogously to $\tilde{y}_{s_{dg}}$ and $X_{dg}$ is the known population total
of x for the (dg)th cell. The post-stratified estimators are based on the
assumption that the $s_{dg}$ are non-empty, and are approximately unbiased. In case $n_{dg}
= 0$ for some cells, the sampling practitioner may attempt to combine cells to
avoid the problems arising out of zero counts. Another combined ratio type
estimator based on groups is given by

$$\hat{t}_{d(PostG/R)} = \left(\sum_g X_{dg}\right)\frac{\sum_g N_{dg}\,\tilde{y}_{s_{dg}}}{\sum_g N_{dg}\,\tilde{x}_{s_{dg}}}.$$

Synthetic Estimator 

If all cells are non-empty but some $n_{dg}$ are extremely small, $\tilde{y}_{s_{dg}}$ is likely to be
highly unstable and hence $\hat{t}_{d(PostG)}$ may not be the right estimator. In this
connection Gonzalez’s (1973) method of synthetic estimation may be
followed. The idea behind synthetic estimates in the words of Gonzalez (1973)
is as follows:

‘An unbiased estimate is obtained from the sample survey for a large area;
when this estimate is used to derive estimates for sub-areas on the assumption
that the small areas have the same characteristics as the larger area, we identify
these estimates as synthetic estimates’. The synthetic estimates thus
constructed may be design based as well as model based.

Thus the well known synthetic estimator for $Y_d$ is defined as

$$\hat{t}_{d(SynG)} = \sum_{g=1}^{G} N_{dg}\,\tilde{y}_{s_{.g}}, \quad \text{where} \quad \tilde{y}_{s_{.g}} = \left(\sum_{k \in s_{.g}}\frac{y_k}{\pi_k}\right)\Big/\left(\sum_{k \in s_{.g}}\frac{1}{\pi_k}\right),$$

which is the sample weighted mean of y in the sub-set $s_{.g} = \bigcup_{d=1}^{D} s_{dg}$ that belongs
to $U_{.g}$. As the groups are assumed to be small in number, the probability of an
empty set $s_{.g}$ is considered negligible.

Assuming availability of information on an auxiliary variable x, the
synthetic ratio estimator is defined as

$$\hat{t}_{d(SynG/R)} = \sum_{g=1}^{G} X_{dg}\left(\tilde{y}_{s_{.g}}/\tilde{x}_{s_{.g}}\right).$$
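The contrast between the post-stratified and the synthetic estimators can be made concrete with a short Python sketch. It assumes both are sums of known cell sizes $N_{dg}$ times a $\pi$-weighted mean, taken over the cell sample $s_{dg}$ in the first case and over the pooled group sample $s_{.g}$ in the second; the function names, group labels and data are hypothetical.

```python
def weighted_mean(units):
    """pi-weighted mean of y over a list of (y, pi) pairs."""
    return sum(y / p for y, p in units) / sum(1.0 / p for y, p in units)

def post_stratified_total(cells, N_dg):
    """cells[g]: (y, pi) pairs of s_dg for one domain d; N_dg[g]: known cell sizes."""
    return sum(N_dg[g] * weighted_mean(cells[g]) for g in N_dg)

def synthetic_total(groups, N_dg):
    """groups[g]: (y, pi) pairs of s_.g, pooled over all domains."""
    return sum(N_dg[g] * weighted_mean(groups[g]) for g in N_dg)

# hypothetical data for one domain d and two groups
N_dg = {"g1": 5, "g2": 10}
cells = {"g1": [(10.0, 0.2)], "g2": [(20.0, 0.2), (30.0, 0.2)]}
groups = {"g1": [(10.0, 0.2), (14.0, 0.2), (12.0, 0.2)],
          "g2": [(20.0, 0.2), (30.0, 0.2), (22.0, 0.2), (28.0, 0.2)]}
t_post = post_stratified_total(cells, N_dg)
t_syn = synthetic_total(groups, N_dg)
print(t_post, t_syn)
```

The synthetic version uses every sampled unit of a group, so it is far more stable when a cell holds only one or two units, at the price of bias when the domain differs from the rest of the group.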


The synthetic estimators are unbiased if small area means or ratios are equal to
their larger area counterparts. Their variances are ordinarily very small. In case
there is a possibility of a substantial bias due to a small departure from the
assumption that parts of small areas within the same domain resemble each
other, their use should be avoided by practitioners. In such a situation it is
sometimes suggested to combine direct and synthetic estimators by using
suitable weights to arrive at a composite estimator, which is in fact a weighted
average of a design based estimator and a synthetic estimator. More about
synthetic ratio estimators and their modifications is found in Gonzalez (1973),
Gonzalez and Waksberg (1973), Singh and Tessier (1976), Ghangurde and
Singh (1977, 1978) and others.

Synthetic estimators based on model based approaches are developed on
the lines suggested by Royall (1970), Royall and Herson (1973a, b), Cassel,
Sarndal and Wretman (1976), Sarndal (1980) and Wright (1983). A rigorous
treatment of small area estimation in survey sampling is contained in
Mukhopadhyay (1998).


10.6 Domains in Multi-stage Sampling 

Consider a problem of multi-stage sampling in which p first stage units are
selected from P first stage units with replacement. From the ith first stage unit
of size $N_i$, m s.s.u. are selected. We are interested in estimating the overall domain
mean

$$\bar{Y}_d = \sum_{i=1}^{P}\sum_{j=1}^{N_{id}} Y_{ij(d)}\Big/N_d,$$

where $N_d$ is the total domain frequency and is equal to

$$N_d = \sum_{i=1}^{P} N_{id} = \sum_{i=1}^{P}\sum_{j=1}^{N_i} I_{ij(d)},$$

where $I_{ij(d)} = 1$ if the jth second stage
unit of the ith first stage unit belongs to the domain d and zero otherwise.

The estimator of $\bar{Y}_d$ is the sample analogue

$$\bar{y}_d = \frac{1}{n_d}\sum_{i=1}^{p}\sum_{j=1}^{n_{id}} y_{ij(d)} = \frac{1}{n_d}\sum_{i=1}^{p} y_{i(d)},$$

where $n_d = \sum_{i=1}^{p}\sum_{j=1}^{m} I_{ij(d)} = \sum_{i=1}^{p} n_{id}$.

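To the first order of approximation,

$$E(\bar{y}_d) = \frac{E\left(\sum_{i=1}^{p}\sum_{j} y_{ij(d)}\right)}{E(n_d)} = \bar{Y}_d.$$

Define $z_{ij} = I_{ij(d)}(y_{ij} - \bar{Y}_d) = y_{ij(d)} - I_{ij(d)}\bar{Y}_d$. Thus, we have

$$\bar{y}_d - \bar{Y}_d = \frac{1}{n_d}\sum_{i=1}^{p}\sum_{j} z_{ij} = \frac{1}{n_d}\sum_{i=1}^{p} z_i = \frac{p}{n_d}\,\bar{z},$$

where $\bar{z}$ is the mean of the p values $z_i = \sum_j z_{ij}$.

The variance of $\bar{y}_d$ is therefore

$$V(\bar{y}_d) = \frac{p^2}{[E(n_d)]^2}\,V(\bar{z}).$$

Now

$$v(\bar{z}) = \frac{1}{p(p - 1)}\sum_{i=1}^{p}(z_i - \bar{z})^2.$$

Therefore

$$v(\bar{y}_d) = \frac{p}{(p - 1)[E(n_d)]^2}\sum_{i=1}^{p}(z_i - \bar{z})^2.$$

Write

$$\sum_{i=1}^{p}(z_i - \bar{z})^2 = \sum_{i=1}^{p} z_i^2 - p\bar{z}^2 = \sum_{i=1}^{p}(y_{i(d)} - n_{id}\bar{Y}_d)^2 - \frac{n_d^2}{p}(\bar{y}_d - \bar{Y}_d)^2.$$

Assuming $n_d = E(n_d)$, we have

$$v(\bar{y}_d) = \frac{p}{(p - 1)\,n_d^2}\sum_{i=1}^{p}(y_{i(d)} - n_{id}\bar{Y}_d)^2 - \frac{(\bar{y}_d - \bar{Y}_d)^2}{p - 1}.$$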


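Since, to the same order of approximation, $E(\bar{y}_d - \bar{Y}_d)^2 = V(\bar{y}_d)$, taking expectations we have

$$V(\bar{y}_d) = E\left[\frac{p}{(p - 1)\,n_d^2}\sum_{i=1}^{p}(y_{i(d)} - n_{id}\bar{Y}_d)^2\right] - \frac{V(\bar{y}_d)}{p - 1}.$$

Hence,

$$\left(1 + \frac{1}{p - 1}\right)V(\bar{y}_d) = E\left[\frac{p}{(p - 1)\,n_d^2}\sum_{i=1}^{p}(y_{i(d)} - n_{id}\bar{Y}_d)^2\right].$$

This leads to

$$\hat{V}(\bar{y}_d) = \frac{p - 1}{p}\cdot\frac{p\sum_{i=1}^{p}(y_{i(d)} - n_{id}\bar{Y}_d)^2}{(p - 1)\,n_d^2} = \frac{1}{n_d^2}\sum_{i=1}^{p}(y_{i(d)} - n_{id}\bar{Y}_d)^2.$$

As $\hat{V}(\bar{y}_d)$ is still dependent on $\bar{Y}_d$, we may substitute $\bar{y}_d$ for $\bar{Y}_d$ to the first order of
approximation and get an approximate estimator of $V(\bar{y}_d)$ given by

$$v(\bar{y}_d) = \frac{1}{n_d^2}\sum_{i=1}^{p}(y_{i(d)} - n_{id}\,\bar{y}_d)^2.$$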
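The final variance estimator needs only the domain totals $y_{i(d)}$ and domain counts $n_{id}$ of the sampled first stage units. A minimal Python sketch, assuming the estimator reads $v(\bar{y}_d) = n_d^{-2}\sum_i (y_{i(d)} - n_{id}\bar{y}_d)^2$, is given below; the function name and data are hypothetical.

```python
def multistage_domain_mean(fsu_domain_values):
    """fsu_domain_values: one list per sampled first stage unit, holding the
    y-values of its sampled second stage units that belong to the domain
    (a list may be empty)."""
    totals = [sum(v) for v in fsu_domain_values]    # y_i(d)
    counts = [len(v) for v in fsu_domain_values]    # n_id
    n_d = sum(counts)
    if n_d == 0:
        raise ValueError("no sampled unit falls in the domain")
    ybar_d = sum(totals) / n_d
    # squared deviations of fsu totals from n_id * ybar_d, scaled by n_d^2
    v = sum((t - c * ybar_d) ** 2 for t, c in zip(totals, counts)) / n_d ** 2
    return ybar_d, v

# hypothetical sample of p = 3 first stage units
ybar_d, v_ybar_d = multistage_domain_mean([[2.0, 4.0], [6.0], [3.0, 5.0, 7.0]])
print(ybar_d, v_ybar_d)
```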


10.7 Estimation for Domains in Sampling on Two Occasions 

The problem of estimating sub-population parameters on successive occasions 
arises due to (i) change in the value of units in domains, (ii) change in the 
number of units in domains, (iii) change in domain structure from one occasion 
to another, that is, entry (or exit) of some units into (or from) a domain. 

Tripathi (1988) has considered the problem of simultaneous estimation of 
D-domain means using a partial replacement scheme. 

For estimating change in domain means in repeated surveys Holt and
Skinner (1989) have suggested procedures in case of fixed population-fixed
domains and fixed population-varying domains.

10.8 Conditional Inference 

In survey sampling the sampling design defines a sample space S consisting of all
possible samples s and the associated probabilities of selection. The choice of
an estimator depends on the mean square error criterion under repeated sampling
with S taken as the reference set. The unconditional mean square error may
be the appropriate one at the design stage, but the Fisherian school of thought (Fisher,
1935) does not regard the sample space S as the relevant reference set for
inference after the sample has been drawn if the sample contains recognizable
subsets. However, the choice of the relevant reference set may not always be
unique. In sample surveys it is usually seen that the sample desired to be
sampled is not the one actually observed. Holt and Smith (1979) have
emphasized that the sample s may be regarded as the unique and relevant reference
set in some sense. The uneasiness in solving inferential problems with
unconditional argument when the sample size is random has long been felt by
Bartlett (1937), Cox (1958) and Bartholomew (1967). If the sample sizes are
random, such as domain sample sizes, and their distribution is completely known,
then the inference should reasonably be made conditional on the observed
sample sizes (Durbin, 1968; Swain, 1969). Swain (1969) has shown that for
high confidence coefficients the conditional intervals are shorter than the
unconditional intervals on the average. J.N.K. Rao (1985) has given a thought
provoking discussion on conditional inference in survey sampling, which
forms the basis of the present section.

In simple random sampling with replacement conditional inference gives
encouraging results. If n units are selected out of N units with SRSWR, the
sample space consists of $N^n$ samples. Let $\nu$ be the number of distinct units in the
sample and $t_i$ be the number of times the ith unit occurs in the sample s. It is now
worth comparing the sample mean based on all the draws, $\bar{y}_n = \frac{1}{n}\sum_{i \in s} t_i y_i$, and the
sample mean based on distinct units, $\bar{y}_\nu$. Both $\bar{y}_n$ and $\bar{y}_\nu$ are unconditionally
unbiased under the reference set S and the unconditional mean square error of
$\bar{y}_\nu$ is always smaller than that of $\bar{y}_n$, showing thereby that $\bar{y}_\nu$ should be preferred
to $\bar{y}_n$ as an estimator of the population mean. This supports Durbin’s (1968)
argument to condition on the observed value $\nu$.
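For a small population the comparison of $\bar{y}_n$ and $\bar{y}_\nu$ can be verified by enumerating all $N^n$ ordered SRSWR samples. The Python sketch below does this for a hypothetical population of N = 4 units and n = 3 draws; the function name and the population values are illustrative only.

```python
from itertools import product

def compare_srswr_means(y, n):
    """Enumerate all N**n ordered SRSWR samples and return the bias and
    mean square error of the all-draws mean and the distinct-units mean."""
    N = len(y)
    Ybar = sum(y) / N
    samples = list(product(range(N), repeat=n))
    out = {"bias_all": 0.0, "bias_distinct": 0.0,
           "mse_all": 0.0, "mse_distinct": 0.0}
    for s in samples:
        ybar_n = sum(y[i] for i in s) / n                       # mean over all draws
        distinct = set(s)
        ybar_v = sum(y[i] for i in distinct) / len(distinct)    # mean over distinct units
        out["bias_all"] += (ybar_n - Ybar) / len(samples)
        out["bias_distinct"] += (ybar_v - Ybar) / len(samples)
        out["mse_all"] += (ybar_n - Ybar) ** 2 / len(samples)
        out["mse_distinct"] += (ybar_v - Ybar) ** 2 / len(samples)
    return out

result = compare_srswr_means([1.0, 2.0, 4.0, 9.0], n=3)
print(result)
```

Both biases vanish up to rounding error, and the mean square error of the distinct-units mean is the smaller of the two.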

In SRSWOR the relevant reference set is S, consisting of $\binom{N}{n}$ samples, and no other
recognizable relevant subset seems to exist. But when we try to impose
post-stratification on the observed sample s of size n, the sample configuration
is seen to be $\mathbf{n} = (n_1, n_2, \ldots, n_k)$, belonging to k post-strata with known weights
$N_h/N$. In the absence of frames for the post-strata, it is a reasonable way to
obtain a more precise estimate. The relevant reference set is now the set $S_{\mathbf{n}}$ of
$\prod_h \binom{N_h}{n_h}$ samples having the observed configuration $\mathbf{n}$, since the
distribution of $(n_1, n_2, \ldots, n_k)$ is completely known. For $n_h \geq 1$, the usual
post-stratified estimator $\bar{y}_{pst} = \sum (N_h/N)\,\bar{y}_h$ is conditionally unbiased and
moreover the unconditional variance of $\bar{y}_{pst}$ is smaller than the unconditional
variance of the unstratified sample mean $\bar{y}$. It may be pointed out here that $\bar{y}$ is
conditionally biased because

$$E(\bar{y} \mid n_1, \ldots, n_k) = E\left[\sum_{h=1}^{k}\frac{n_h}{n}\,\bar{y}_h \,\Big|\, n_1, \ldots, n_k\right] = \sum_{h=1}^{k}\frac{n_h}{n}\,\bar{Y}_h.$$

For the case $n_h = 0$ for some h, a situation which occurs when n is small and the
number of strata chosen is very large, the post-stratified estimator may be set up
as

$$\bar{y}'_{pst} = \sum{}' W_h\,\bar{y}_h,$$

where the summation $\sum'$ is over the strata having non-zero $n_h$. $\bar{y}'_{pst}$ is both
conditionally and unconditionally biased. The usual procedure is to collapse the
strata to make all $n_h > 0$.

An unconditionally unbiased estimator as given by Doss, Hartley and
Somayajulu (1979) is

$$\bar{y}_{DHS} = \sum_h W_h\,\frac{a_h}{E(a_h)}\,\bar{y}_h,$$

where $a_h = 1$ if at least one unit belongs to stratum h in the sample and $a_h = 0$
otherwise.

Doss, Hartley and Somayajulu (1979) have modified $\bar{y}_{DHS}$ to make it
translation invariant in the form of a ratio estimator, given by

$$\bar{y}^*_{DHS} = \sum_h W_h\,\frac{a_h}{E(a_h)}\,\bar{y}_h \Big/ \sum_h W_h\,\frac{a_h}{E(a_h)},$$

which is conditionally biased.
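The unconditional unbiasedness of the first Doss-Hartley-Somayajulu estimator can be checked by complete enumeration for a small population. The sketch below assumes SRSWOR of size n, for which $E(a_h) = 1 - \binom{N - N_h}{n}\big/\binom{N}{n}$ (the probability that stratum h is represented; this expression is supplied here as an assumption, not taken from the text). The function name and the population are hypothetical.

```python
from itertools import combinations
from math import comb

def dhs_expectation(strata, n):
    """Average of the DHS estimator over all SRSWOR samples of size n.

    strata: list of lists, the y-values of each post-stratum."""
    units = [(h, y) for h, ys in enumerate(strata) for y in ys]
    N = len(units)
    # probability that stratum h contributes at least one unit to the sample
    E_a = [1 - comb(N - len(ys), n) / comb(N, n) for ys in strata]
    W = [len(ys) / N for ys in strata]
    total, count = 0.0, 0
    for s in combinations(units, n):
        est = 0.0
        for h in range(len(strata)):
            y_h = [y for g, y in s if g == h]
            if y_h:                                   # a_h = 1
                est += W[h] * (sum(y_h) / len(y_h)) / E_a[h]
        total += est
        count += 1
    return total / count

population = [[1.0, 2.0, 3.0], [10.0, 20.0], [100.0]]
expected_value = dhs_expectation(population, n=2)
population_mean = sum(sum(ys) for ys in population) / 6
print(expected_value, population_mean)
```

The enumeration covers samples in which one or two strata are empty, which is exactly the situation the estimator is designed for.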

Conditional Inference in the Estimation of Domain Mean
and Domain Total

Domain mean

In SRS, if the domain size $N_d$ is known, it is appropriate to condition on the
observed $n_d$. If $n_d > 0$, $\bar{y}_d$ is conditionally unbiased for fixed $n_d$, as the
corresponding sample is an SRS. However, the estimator is unstable for a small
domain and fails to be defined for $n_d = 0$. To overcome this difficulty the
estimator may be modified as

$$\bar{y}'_d = \frac{a_d}{E(a_d)}\,\bar{y}_d,$$

where $a_d = 1$ if $n_d > 0$ and $a_d = 0$ otherwise. $\bar{y}'_d$ is seen to be conditionally biased.

For small area estimation Sarndal (1984) proposed an estimator

$$\bar{y}_{dS} = \bar{y} + \frac{w_d}{W_d}\,(\bar{y}_d - \bar{y}),$$

where

$$\bar{y} = \sum_d w_d\,\bar{y}_d, \quad w_d = \frac{n_d}{n} \quad \text{and} \quad W_d = \frac{N_d}{N}.$$

Hidiroglou and Sarndal (1985) proposed a further modification such as

$$\bar{y}^*_{HS} = \begin{cases} \bar{y}_d & \text{if } w_d \geq W_d, \\ \bar{y} + \left(\dfrac{w_d}{W_d}\right)^2(\bar{y}_d - \bar{y}) & \text{if } w_d < W_d. \end{cases}$$

If the $N_d$’s are unknown, the conditional argument may still become relevant if
$N_d$ is unrelated to the parameter of interest $\bar{Y}_d$.

Domain total 

The domain total $Y_d = N_d \bar Y_d$ is estimated by $\hat Y_d = N_d \bar y_d$ if $N_d$ is known. If $N_d$ is
not known, $Y_d$ is estimated by

$$\hat Y'_d = \hat N_d\,\bar y_d = \left(\frac{N}{n}\right) n_d\,\bar y_d , \quad n_d \ge 1 ,$$

and $P(n_d = 0)$ is assumed to be negligible.

The conditional bias of $\hat Y'_d$ is $(\hat N_d - N_d)\,\bar Y_d$.

If we have prior information, say $N_d^{*} \le N_d \le N_d^{**}$, the conditional
argument may be seen to be relevant. J.N.K. Rao (1985) has modified $\hat Y'_d$ as

$$\hat Y_d^{*} = \begin{cases} N_d^{*}\,\bar y_d & \text{if } \hat N_d < N_d^{*} \\ \hat N_d\,\bar y_d & \text{if } N_d^{*} \le \hat N_d \le N_d^{**} \\ N_d^{**}\,\bar y_d & \text{if } \hat N_d > N_d^{**} \end{cases}$$

The absolute conditional bias of $\hat Y_d^{*}$ is smaller than that of $\hat Y'_d$ if either $\hat N_d < N_d^{*}$
or $\hat N_d > N_d^{**}$, while $\hat Y_d^{*} = \hat Y'_d$ in the interval $N_d^{*} \le \hat N_d \le N_d^{**}$. Hence, $\hat Y_d^{*}$ is
conditionally better than $\hat Y'_d$. Further, the unconditional MSE of $\hat Y_d^{*}$ is also
smaller than that of $\hat Y'_d$, although $\hat Y_d^{*}$ is unconditionally biased.
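The truncation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sample figures are those quoted in Example 10.1, and the prior bounds 68 and 75 are invented for the demonstration.

```python
def expansion_domain_total(N, n, n_d, ybar_d):
    """Estimate of the domain total when N_d is unknown: (N/n) * n_d * ybar_d."""
    return (N / n) * n_d * ybar_d


def truncated_domain_total(N, n, n_d, ybar_d, N_lo, N_hi):
    """Truncated estimator: the estimated domain size is clamped to [N_lo, N_hi]."""
    N_hat_d = N * n_d / n                     # estimated domain size
    N_used = min(max(N_hat_d, N_lo), N_hi)    # N_lo if too small, N_hi if too large
    return N_used * ybar_d


# Figures from Example 10.1; the bounds 68 and 75 are hypothetical prior limits.
plain = expansion_domain_total(300, 50, 11, 72.7 / 11)
clamped = truncated_domain_total(300, 50, 11, 72.7 / 11, N_lo=68, N_hi=75)
```

Here the estimated domain size is 66, which falls below the hypothetical lower bound, so the truncated estimate uses 68 in its place; with bounds that contain 66 the two estimates coincide.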

NUMERICAL EXAMPLE 

Example 10.1: A sample survey was conducted in Champua Block of Keonjhar
district, Orissa (India) in 1992 relating to annual household income, total
expenditure (x) and expenditure on food (y) on a target population of 300 low
income households, with a simple random sample of 50 households. The
relevant data in a summarized form are given below. Estimate the average food
expenditure per household and the total expenditure on food items of the domain
'd' consisting of households having annual income less than Rs. 10,000/-, with
and without use of total expenditure on all items as auxiliary information,
given that the average total expenditure in the concerned domain in the population
is Rs. 9,500/- and the total expenditure on all items in the population is Rs.
3,500,000/-.

Let x represent the total annual expenditure of a household and y its
expenditure on food items, expressed in thousand rupees. Given,

$N = 300$, $n = 50$; $N_d = 70$, $n_d = 11$, $\bar X_d = 9.5$, $X = 3500$,

$$\sum_{k \in s_d} Y_k = 72.7, \quad \sum_{k \in s_d} Y_k^2 = 482.31,$$

$$\sum_{k \in s_d} X_k = 100.5, \quad \sum_{k \in s_d} X_k^2 = 928.97, \quad \sum_{k \in s_d} X_k Y_k = 668.03,$$

and for the entire sample

$$\sum_{k \in s} Y_k = 359.8, \quad \sum_{k \in s} Y_k^2 = 2728.07, \quad \sum_{k \in s} X_k = 518.8,$$

$$\sum_{k \in s} X_k^2 = 5605.12, \quad \sum_{k \in s} X_k Y_k = 4721.59 .$$

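Solution:

a) Estimate of domain mean

The simplest estimate of the domain mean is

$$\bar y_d = \frac{1}{n_d}\sum_{k \in s_d} Y_k = \frac{72.7}{11} = 6.61 \text{ thousand rupees.}$$

If $N_d$ is known,

$$v(\bar y_d) = \frac{(N_d - n_d)}{N_d}\left(\frac{s_d^2}{n_d}\right) = \frac{(70 - 11)}{70}\left(\frac{0.183}{11}\right) = 0.0140 \text{ (thousand rupees)}^2 .$$

If $N_d$ is not known,

$$v(\bar y_d) = \frac{(N - n)}{N}\,\frac{s_d^2}{n_d} = \frac{(300 - 50)}{300}\left(\frac{0.183}{11}\right) = 0.0138 \text{ (thousand rupees)}^2 .$$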
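The ratio estimate is

$$\bar y_{dR} = \left(\frac{y_d}{x_d}\right)\bar X_d = (72.7/100.5)\,(9.50) = 6.8721 \text{ thousand Rs.},$$

where

$$y_d = \sum_{k \in s_d} Y_k \quad \text{and} \quad x_d = \sum_{k \in s_d} X_k .$$

An estimate of the approximate variance of $\bar y_{dR}$ is given by

$$v(\bar y_{dR}) = \frac{(N_d - n_d)}{N_d\,n_d}\;\frac{\sum_{k \in s_d}(Y_k - \hat R_d X_k)^2}{n_d - 1} = \frac{(70 - 11)}{(70 \times 11)}\left[\frac{1.9423}{10}\right] = 0.0149 \text{ (thousand Rs.)}^2 \text{ if } N_d \text{ is known,}$$

and

$$v(\bar y_{dR}) = \frac{(300 - 50)}{(300 \times 11)}\left[\frac{1.9423}{10}\right] = 0.0147 \text{ (thousand Rs.)}^2 \text{ if } N_d \text{ is not known,}$$

where $\hat R_d = y_d / x_d$.

b) Estimate of domain total $Y_d$

If $N_d$ is known, $\hat Y_d = N_d\,\bar y_d = 70\,(6.61) = 462.7$ thousand Rs., with estimated
conditional variance $v(\hat Y_d) = 68.60$ and estimated unconditional variance

$$v(\hat Y_d) = N_d^2\,(1 - f)\,\frac{s_d^2}{n W_d} = 64.05 \text{ (thousand Rs.)}^2 ,$$

where $f = n/N = 50/300$ and $W_d = N_d/N$.

If $N_d$ is not known,

$$\hat Y'_d = \left(\frac{N}{n}\right)(n_d\,\bar y_d) = \frac{300}{50}\,(72.7) = 436.2 \text{ thousand Rs.}$$

and

$$v(\hat Y'_d) = \frac{N^2 (1 - f)}{n(n - 1)}\left[(n_d - 1)\,s_d^2 + n_d\,\bar y_d^2\,(1 - w_d)^2 + (n - n_d)\,\bar y_d^2\,w_d^2\right] = 11531.86 \text{ (thousand Rs.)}^2 ,$$

where $s_d^2 = 0.183$, $\bar y_d = 6.61$, $w_d = n_d/n = 0.22$.

On using auxiliary information on x,

$$\hat Y_{dR} = \left(\frac{y_d}{x}\right) X = \left(\frac{72.7}{518.8}\right)(3500) = \text{Rs. } 490.46 \text{ thousand},$$

where $x = \sum_{k \in s} X_k$, and

$$v(\hat Y_{dR}) = \frac{N(N - n)}{n(n - 1)}\sum_{k \in s}(Y'_k - \hat R X_k)^2 = \frac{\{300\,(300 - 50)\}}{(50 \times 49)}\sum_{k \in s}(Y'_k - \hat R X_k)^2 = 12401.50 \text{ (thousand Rs.)}^2 ,$$

where $\hat R = y_d / x$ and $Y'_k = Y_k$ for units in the domain and $0$ otherwise.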
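The arithmetic of Example 10.1 can be retraced with a short script. This is a sketch rather than part of the original solution: the variable names are invented here, and the residual sums of squares are obtained by expanding $\sum (y - Rx)^2$ from the summary totals, treating y as zero outside the domain for the domain-total ratio estimator.

```python
# Summary figures quoted in Example 10.1 (all money amounts in thousand rupees).
N, n = 300, 50                        # population and sample sizes
N_d, n_d = 70, 11                     # domain size and domain sample size
Xbar_d, X_total = 9.5, 3500.0         # known domain mean and population total of x
sum_y_d, sum_y2_d = 72.7, 482.31      # domain sample sums of y and y^2
sum_x_d, sum_x2_d, sum_xy_d = 100.5, 928.97, 668.03
sum_x_s, sum_x2_s = 518.8, 5605.12    # whole-sample sums of x and x^2

f = n / N
w_d = n_d / n
ybar_d = sum_y_d / n_d
s2_d = (sum_y2_d - sum_y_d ** 2 / n_d) / (n_d - 1)

# (a) Domain mean, without and with the auxiliary variable.
v_mean_known = (N_d - n_d) / N_d * s2_d / n_d
v_mean_unknown = (N - n) / N * s2_d / n_d
R_d = sum_y_d / sum_x_d
ybar_dR = R_d * Xbar_d
ss_d = sum_y2_d - 2 * R_d * sum_xy_d + R_d ** 2 * sum_x2_d   # sum of (y - R_d x)^2 over s_d
v_ratio_known = (N_d - n_d) / (N_d * n_d) * ss_d / (n_d - 1)
v_ratio_unknown = (N - n) / (N * n_d) * ss_d / (n_d - 1)

# (b) Domain total.
Y_known = N_d * ybar_d
Y_unknown = (N / n) * sum_y_d
ss_zero = ((n_d - 1) * s2_d + n_d * ybar_d ** 2 * (1 - w_d) ** 2
           + (n - n_d) * ybar_d ** 2 * w_d ** 2)
v_Y_unknown = N ** 2 * (1 - f) / (n * (n - 1)) * ss_zero
R = sum_y_d / sum_x_s
Y_ratio = R * X_total
ss_r = sum_y2_d - 2 * R * sum_xy_d + R ** 2 * sum_x2_s       # y is 0 outside the domain
v_Y_ratio = N * (N - n) / (n * (n - 1)) * ss_r
```

Small differences from the printed figures are to be expected, since the text rounds the domain mean to 6.61 before further use.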


EXERCISES 

10.1 Let $y_j$ and $y_k$ be the sample totals based on $n_j$ and $n_k$ units of the jth and kth
of a set of domains (non-overlapping) into which the units in a simple
random sample are classified, and let $x_j$ and $x_k$ be the corresponding totals of the auxiliary variable x.

"" ~ " f »*>>, ii) CA) - <2ZSffS£
(yk/xk) X, iv) yj - yk, v) <y/x) - <**), y being the sample total based on n
observations, vi) (y/xy)/

10.2 Suppose the domain sizes $N_1, N_2, \ldots$ of the finite population of N units
are not known. We take a preliminary sample of size n and let $n_1$ fall into
Domain $D_1$. Choose a sub-sample $n'_1$ out of $n_1$ and observe the
y-characteristic to compute $\bar y'_1$. An estimate of the population total $Y_1$ of
Domain $D_1$ is given by $\hat Y_1 = N\,(n_1/n)\,\bar y'_1$.

Derive the mean square error of $\hat Y_1$.

10.3 Derive formulae to compute variance of the difference of ratio estimators 
for two domains in (a) stratified sampling, (b) cluster sampling and 
suggest an estimate of the variance in each case. 

10.4 N units of the population are classified into k mutually exclusive classes
having $(N_1, \ldots, N_k)$ units. The number of units in the ith class $(N_i)$
possessing attribute A is $N_{iA}$ (i = 1, ..., k). In a simple random sample of
size n let the number of units falling into the ith class be $n_i$ and those
possessing the attribute A be $n_{iA}$ (i = 1, ..., k).

$n_1, n_2, \ldots, n_k$ follow a hypergeometric distribution

$$P(n_1, n_2, \ldots, n_k \mid N_1, N_2, \ldots, N_k) = \frac{\binom{N_1}{n_1}\cdots\binom{N_k}{n_k}}{\binom{N}{n}} .$$

Further, $n_{iA}$ for $n_i > 0$ has a hypergeometric distribution given by

$$P(n_{iA}, n_i - n_{iA} \mid N_{iA}, N_i - N_{iA}) = \frac{\binom{N_{iA}}{n_{iA}}\binom{N_i - N_{iA}}{n_i - n_{iA}}}{\binom{N_i}{n_i}} .$$

Defining $p'_{iA} = n_{iA}/n_i$, $P_{iA} = N_{iA}/N_i$ and $P_i = N_i/N$, find the expected
value and variance of $p'_{iA}$. Suggest an unbiased variance estimator.

Hint: $E(p'_{iA}) = E[E(p'_{iA} \mid n_i)] = P_{iA}$, and

$$V(p'_{iA}) = E[V(p'_{iA} \mid n_i)] + V[E(p'_{iA} \mid n_i)] = E\left[\frac{N_i - n_i}{N_i - 1}\,\frac{P_{iA}(1 - P_{iA})}{n_i}\right] = \frac{N_i}{N_i - 1}\,P_{iA}(1 - P_{iA})\,E\left[\frac{1}{n_i} - \frac{1}{N_i}\right] .$$

Using an approximation

$$E(n_i^{-1}) \approx \left(\frac{1}{nP_i}\right)\left[1 + \frac{(1 - P_i)}{nP_i}\right] ,$$

$$V(p'_{iA}) \approx \left[\frac{NP_i}{(NP_i - 1)}\right] P_{iA}(1 - P_{iA}) \left[\frac{1}{nP_i}\left\{1 + \frac{(1 - P_i)}{nP_i}\right\} - \frac{1}{NP_i}\right] .$$

Exact unbiased estimate is given by

$$v(p'_{iA}) = \left[\frac{(N_i - n_i)}{N_i}\right]\left[\frac{p'_{iA}(1 - p'_{iA})}{(n_i - 1)}\right] .$$

10.5 Consider two small areas k (= 1, 2) and the super-population model
$E_m(y_i) = \mu$, $V_m(y_i) = \sigma^2$ $(i = 1, \ldots, N)$,
$\mathrm{cov}(y_i, y_{i'}) = 0$ $(i \ne i' = 1, \ldots, N)$.

Consider two predictors of small area total $Y_1$:

$$T_1 = N_1 \sum_{s_1} y_i / n_1 ,$$

$$T_2 = \sum_{s_1} y_i + \frac{N_1 - n_1}{n}\sum_{s} y_i ,$$

where $n_1$, $n_2$ are random, but $n$ $(= n_1 + n_2)$ is fixed. Assuming that the
sample sizes $n_1$, $n_2$ are conditionally fixed, show that the relative gain in
efficiency in using $T_2$ compared to $T_1$ is

$$G = \frac{V_1 - V_2}{V_2} = \frac{n_2\,(N_1 - n_1)}{n_1\,(N_1 + n_2)} ,$$

where

$$V_k = E_m (T_k - Y_1)^2 , \quad k = 1, 2 \quad \text{(Mukhopadhyay, 1998).}$$



11. VARIANCE ESTIMATION IN 
COMPLEX SURVEYS 


11.1 Introduction

Large scale sample surveys generate large amounts of varied data, which call
for efficient methods of analysis and interpretation. The precision of the estimates
computed from survey data needs to be assessed, and a commonly
used measure of precision is the variance of the survey estimator. In practice the
variances are not known and hence must be estimated from the sample survey
data. Structurally, the variance of an estimator depends on the form of the
statistic and the sampling design, and hence the estimator of variance depends
on these considerations. The calculation of variance estimators for complex
surveys often involves heavy computations. In such situations, it is
appropriate to seek simplicity in the variance estimators
without significant loss in accuracy. This calls for constructing
reasonably efficient non-standard variance estimators in complex surveys in
place of the standard ones known in the literature, the computation of which may
be time consuming and cost prohibitive. Wolter (1985) has given an excellent
account of both the theory and applications of the various variance estimating
techniques for simple as well as complex survey designs with exhaustive
references on the subject.


11.2 Taylor’s Series Expansion Method - Linearization Technique 

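Consider a given finite population consisting of N units. Let $\theta = (\theta_1, \ldots, \theta_p)$
denote the vector of population parameters and let $t = (t_1, \ldots, t_p)$ denote the
corresponding vector of estimators based on samples of size n following any
sampling design. Suppose $g(\theta_1, \ldots, \theta_p)$ is the parametric function of interest to
be estimated. Let us take $g(t_1, \ldots, t_p)$ as an estimator of $g(\theta_1, \ldots, \theta_p)$. If $g(t)$
possesses continuous derivatives of order 2 in an open sphere containing t and
θ, then by Taylor's theorem we have

$$g(t) = g(\theta) + \sum_{j=1}^{p}(t_j - \theta_j)\left[\frac{\partial g(t)}{\partial t_j}\right]_{t=\theta} + R ,$$

where

$$R = \sum_{j=1}^{p}\sum_{k=1}^{p}\frac{1}{2!}\left[\frac{\partial^2 g(t^*)}{\partial t_j\,\partial t_k}\right](t_j - \theta_j)(t_k - \theta_k)$$

and t* is between t and θ. In finite populations the validity of the Taylor series
expansion is assumed to be approximately satisfied. Further, treating R as an
unimportant component in the expansion, a linear approximation to g(t) is
given by

$$g(t) \approx g(\theta) + \sum_{j=1}^{p}(t_j - \theta_j)\left[\frac{\partial g(t)}{\partial t_j}\right]_{t=\theta} .$$

The variance of g(t) is

$$V[g(t)] \approx \sum_{j=1}^{p}\left[\frac{\partial g}{\partial t_j}\right]^2_{t=\theta} V(t_j) + 2\sum\sum_{j<k}\left[\frac{\partial g}{\partial t_j}\right]_{t=\theta}\left[\frac{\partial g}{\partial t_k}\right]_{t=\theta}\mathrm{cov}(t_j, t_k) .$$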


An estimate of $V[g(t)]$ is given by substituting the sample estimates in place of
the population parameters.

Although convergence problems exist in expanding a function of statistics
calculated from samples from finite populations, the method is widely used in
sample surveys for linearizing non-linear estimators, so as to enable one to
compute approximately the variance of the estimate and hence the estimate of
variance.
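As an illustration of the linearization technique, take the ratio $g(t) = t_1/t_2$ of two sample totals under simple random sampling without replacement. Substituting the partial derivatives into the variance formula gives the familiar residual form used in the sketch below. The function name is hypothetical, and the data are the guava figures of Table 11.1 in the numerical examples of this chapter.

```python
def ratio_linearization_variance(y, x, N):
    """Linearized variance estimate of R = sum(y)/sum(x) under SRSWOR.

    Uses v(R) = (N - n)/(N n) * sum((y_i - R x_i)^2) / ((n - 1) * xbar^2).
    """
    n = len(y)
    R = sum(y) / sum(x)
    xbar = sum(x) / n
    resid_ss = sum((yi - R * xi) ** 2 for yi, xi in zip(y, x))
    return R, (N - n) / (N * n) * resid_ss / ((n - 1) * xbar ** 2)


# Table 11.1: number of guava trees (y) and orchard area in acres (x), N = 146 villages.
y = [492, 1008, 714, 1265, 1889, 784, 294, 798, 780, 619, 403, 467]
x = [4.80, 5.99, 4.27, 8.43, 14.39, 6.53, 1.88, 6.35, 6.58, 9.18, 2.00, 2.20]
R_hat, v_lin = ratio_linearization_variance(y, x, N=146)
```

The residuals $y_i - \hat R x_i$ play the role of a linearized variable: once they are formed, the variance estimate is that of an ordinary sample mean scaled by $1/\bar x^2$.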


11.3 Method of Random Group (Replicated or Interpenetrating 
Sampling Method) 

The random group method of variance estimation consists in selecting two or 
more independent samples from the same population following the same 
sampling design. Separate estimates of the population parametric function in 
question are constructed from these samples and then sample variance is 
computed among estimates. This is one of the oldest techniques of variance 
estimation in complex surveys and these samples are called Interpenetrating 
samples by Mahalanobis (1944, 1946) and Replicated samples by Deming 
(1956). Hansen, Hurwitz and Madow (1953) referred to this technique as the random
group method. The random groups may be either independent or dependent,
depending on the method of forming them.

Independent Random Groups 

Samples are drawn from the population with replacement so that the samples
become mutually independent and so also the estimates derived from these
samples. Let $t_1, \ldots, t_g$ be uncorrelated random variables with $E(t_j) = \theta$. Then,

$$\bar t = \frac{1}{g}\sum_{j=1}^{g} t_j$$

is an unbiased estimator of θ and an unbiased estimator of $V(\bar t)$ is given by

$$v(\bar t) = \frac{\sum_{j=1}^{g}(t_j - \bar t)^2}{g(g-1)} ,$$

which is called a random group estimator of variance.

In case of application of the random group technique to non-linear estimators,
one may compute an alternative estimator, say t, of θ by combining all random
groups. For example, in the ratio method of estimation the usual parameter of
interest is $\theta = Y/X$.

Define $\hat Y_j$ and $\hat X_j$ as the estimates of the population totals of Y and X from the
jth random group. Then,

$$\bar t = \frac{1}{g}\sum_{j=1}^{g}\left(\frac{\hat Y_j}{\hat X_j}\right) \quad \text{and} \quad t = \frac{\sum_{j=1}^{g}\hat Y_j / g}{\sum_{j=1}^{g}\hat X_j / g} .$$

The estimator t is to be preferred as an estimator of θ because it has smaller
bias than $\bar t$.

In this situation we may form two random group estimators of variance of
t, such as

$$v_1(t) = \frac{\sum_{j}(t_j - \bar t)^2}{g(g-1)}$$

and

$$v_2(t) = \frac{\sum_{j}(t_j - t)^2}{g(g-1)} ,$$

where $t_j = \hat Y_j / \hat X_j$. It may be verified that $v_1(t) \le v_2(t)$, showing thereby $v_2(t)$ as a conservative
estimator of V(t).
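The two random group estimators can be compared numerically. The sketch below uses the four group totals that appear in Example 11.1(b) of this chapter. The function name is hypothetical, and the group totals are used directly because the ratio is unaffected by the common expansion factor.

```python
def random_group_variances(y_tot, x_tot):
    """Random group estimators for a ratio: returns (t_bar, t, v1, v2)."""
    g = len(y_tot)
    ratios = [yj / xj for yj, xj in zip(y_tot, x_tot)]   # t_j from each group
    t_bar = sum(ratios) / g                              # mean of the group ratios
    t = sum(y_tot) / sum(x_tot)                          # ratio from the combined groups
    v1 = sum((r - t_bar) ** 2 for r in ratios) / (g * (g - 1))
    v2 = sum((r - t) ** 2 for r in ratios) / (g * (g - 1))
    return t_bar, t, v1, v2


# Group totals of y and x for the four random groups of Example 11.1(b).
y_groups = [2214, 3938, 1872, 1489]
x_groups = [15.06, 29.35, 14.81, 13.38]
t_bar, t, v1, v2 = random_group_variances(y_groups, x_groups)
```

Since $\sum_j (t_j - t)^2 = \sum_j (t_j - \bar t)^2 + g(\bar t - t)^2$, the two estimators differ by exactly $(\bar t - t)^2/(g-1)$, which is why $v_2$ can never fall below $v_1$.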

Using second order Taylor series expansions it can be shown that Bias($v_1$)
= Bias($v_2$) and, to the same order, the variances of $v_1$ and $v_2$ are equal.

Non-Independent Random Groups

In practice, the sample is usually selected without
replacement, not as a series of independent random groups. However, the random
group technique may still be adopted by randomly dividing the original sample
into random groups, forming an estimator from each group and then the
variance estimator. The estimates from the random groups are no longer
independent because of sampling without replacement.

Due to this lack of independence of the random group estimators, $v(\bar t)$ becomes a
biased estimator of $V(\bar t)$, although the bias, tending to be slightly positive, may be
either negligible or unimportant in large scale sample surveys. For t either
linear or non-linear, detailed discussions in different complex designs are given
by Frankel (1971) and Dippo and Wolter (1984).

All the g replicated samples may be made mutually statistically 
independent by replacing units drawn without replacement after each set of 
draws. Koop (1967) has put forward an interesting result that the replicated 
samples of unequal sizes are more efficient than those with equal sizes. As the 
replicated samples taken together may not have all the units distinct, Pathak 
(1964) has obtained improved estimators of population mean and variance by 
Rao-Blackwellization in interpenetrating subsampling where all the sub 
samples are of equal sizes and drawn by simple random sampling without 
replacement. Bedi (1987) extended Pathak’s (1964) technique to the case when 
the interpenetrating subsamples are of unequal sizes, each drawn with 
SRSWOR. 

Roy and Singh (1973) considered a scheme of selection of sub-samples
without replacement in such a way that the occurrence of common units between the
sub-samples is avoided. Assume that each sub-sample contains m units selected
by SRSWOR: the first sub-sample is selected from all the N units, the second
sub-sample from the remaining $N - m$ units, the third from the remaining
$N - 2m$ units and so on. With g sub-samples we have $n = mg$. The proposed estimator
of the population total Y is given by

$$\bar t = \frac{(t_1 + \ldots + t_g)}{g} ,$$

where $t_1 = N\bar y_1$ and $t_j = m(\bar y_1 + \ldots + \bar y_{j-1}) + [N - (j-1)m]\,\bar y_j$, $j = 2, \ldots, g$, where
$\bar y_j$ is the mean of the jth sub-sample.

It may be verified that $E(\bar t) = Y$ and $\mathrm{cov}(t_j, t_k) = 0$ for all $j \ne k$. Further,

$$V(\bar t) = \left(\frac{1}{mg}\right)\left[N^2 - Nmg + \frac{m^2}{3}(g^2 - 1)\right] S^2 .$$

An unbiased estimator of $V(\bar t)$ is given by

$$v(\bar t) = \frac{\sum_{j=1}^{g}(t_j - \bar t)^2}{g(g-1)} .$$
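A small simulation helps to see how this scheme works. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the g disjoint sub-samples are obtained by cutting one without-replacement draw of mg units into consecutive blocks, which is equivalent to drawing each sub-sample by SRSWOR from the units not yet selected.

```python
import random


def roy_singh_estimate(population, m, g, rng):
    """One realisation of the estimator of the population total and of v(t_bar)."""
    N = len(population)
    order = rng.sample(range(N), m * g)          # m*g distinct units, in draw order
    means = [sum(population[i] for i in order[j * m:(j + 1) * m]) / m
             for j in range(g)]                  # sub-sample means
    t = []
    for j in range(g):                           # j = 0 is the first sub-sample
        # units already drawn contribute m * (earlier means); the current
        # sub-sample mean is expanded to the N - j*m units still available
        t.append(m * sum(means[:j]) + (N - j * m) * means[j])
    t_bar = sum(t) / g
    v = sum((tj - t_bar) ** 2 for tj in t) / (g * (g - 1))
    return t_bar, v


# Hypothetical population 1, 2, ..., 20 (true total 210), m = 2, g = 3.
rng = random.Random(1)
pop = list(range(1, 21))
draws = [roy_singh_estimate(pop, 2, 3, rng) for _ in range(20000)]
mean_total = sum(d[0] for d in draws) / len(draws)
```

Averaged over many repetitions, the estimates should settle near the true total, in line with $E(\bar t) = Y$.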


Stability of Random Group Estimators 

The stability of the random group estimator $v(\bar t)$ is studied by computing its
coefficient of variation. The number of groups g has a larger impact on
decreasing the coefficient of variation of $v(\bar t)$, and hence on increasing the precision
of the variance estimate, than the size of the groups m (Wolter, 1985).


11.4 The Jackknife Method 


Jackknife was first introduced by Quenouille (1949, 1956) to reduce the bias of 
an estimator. Tukey (1958) suggested that the sub-sample estimators derived 
from the Quenouille’s technique may be reasonably used to provide a simple 
estimate of variance. Illuminating contributions on the jackknife are due to 
Miller (1964, 1968, 1974a, 1974b), Brillinger (1966), Arvesen (1969), Efron 
(1982) and others. 

Given a sample s of size n: $y_1, y_2, \ldots, y_n$ under an infinite population
model. Assume that $y_1, \ldots, y_n$ are n independent and identically distributed
random variables with distribution function F(y). Let θ be any parameter of
interest estimated by an estimator t calculated from the whole sample. Let

$$E(t) = \theta + \frac{b_1(\theta)}{n} + \frac{b_2(\theta)}{n^2} + \ldots ,$$

where $b_1(\theta), b_2(\theta), \ldots$ are functions of θ but not of n.

Now, partition the complete sample into g groups of m observations each,
such that $n = mg$. Let $t_{(j)}$ be the estimator of the same functional form as t, but
computed from the reduced sample of size $m(g-1)$ obtained by omitting the jth
group. Define 'pseudo values' $t_j = g\,t - (g-1)\,t_{(j)}$.

Quenouille's estimator is defined by $t_Q = \sum_{j=1}^{g} t_j / g$.

As $t_{(j)}$ is of the same functional form as that of t, we write

$$E(t_{(j)}) = \theta + \frac{b_1(\theta)}{m(g-1)} + \frac{b_2(\theta)}{(m(g-1))^2} + \ldots$$

Then,

$$E(t_Q) = \theta - \frac{b_2(\theta)}{m^2 g(g-1)} + \ldots ,$$

so that the bias term of order $1/n$ is removed.


The pseudo values $t_j$ are approximately independent and identically distributed.
The jackknife estimator of variance of $t_Q$ is given by

$$v_1(t_Q) = \frac{\sum_{j=1}^{g}(t_j - t_Q)^2}{g(g-1)} = \left(\frac{g-1}{g}\right)\sum_{j=1}^{g}(t_{(j)} - t_{(\cdot)})^2 ,$$

where $t_{(\cdot)} = \frac{1}{g}\sum_{j=1}^{g} t_{(j)}$.

In practice $v_1(t_Q)$ is also used to estimate the variance of t.

Tukey (1958) suggests that $(t_Q - \theta)\Big/\sqrt{\sum_j (t_j - t_Q)^2 / g(g-1)}$ may be
taken as having a Student's t-distribution with $(g-1)$ degrees of freedom for the
purpose of constructing confidence intervals for the unknown parameter θ.
Alternatively, one may also use the estimator of $V(t_Q)$ as

$$v_2(t_Q) = \frac{\sum_{j}(t_j - t)^2}{g(g-1)} ,$$

which is a conservative estimator compared to $v_1(t_Q)$ because

$$v_2(t_Q) = v_1(t_Q) + \frac{(t - t_Q)^2}{(g-1)} .$$

In SRSWR $t_Q$ is identically equal to the sample mean $\bar y$ and its estimate of variance
is the same as that of $\bar y$ in SRSWR. In SRSWOR also, $t_Q$ is equal to $\bar y$. However,
$v_1(t_Q)$ is no longer an unbiased estimator of $V(\bar y)$ and in fact serves as a conservative
estimator, as $E[v_1(t_Q)] = S^2/n$. For PPSWR sampling the jackknife estimator of
the population total is the usual PPSWR estimator and the jackknife variance
estimator is $v(\hat Y)$. In unequal probability without replacement sampling the
jackknife preserves the unbiased character of the Horvitz-Thompson estimator of
the population total. For πPS sampling,

$$\text{Bias}\,[v_1(t_Q)] = \frac{n}{(n-1)}\,[\,V(\hat Y_{ppswr}) - V(\hat Y_{\pi ps})\,] .$$


It may be observed that there is no worthwhile computational advantage
in jackknifing compared to traditional techniques. The real utility lies in bias
reduction and variance estimation of non-linear estimators such as the ratio, the
regression coefficient, etc. in complex surveys. Durbin (1959) has applied
Quenouille's method of bias reduction to the estimation of population ratios
and has shown that the technique reduces not only the bias but also the variance.
Similar conclusions have been derived by Swain and Mishra (1992) in
connection with the estimation of the ratio of population variances. Chakrabarty and
Rao (1967) and Krewski and Chakrabarty (1981) have studied the stability
of the jackknife variance estimator in ratio estimation. Jackknifing non-linear
estimators based on stratified samples for variance estimation has been
discussed by McCarthy (1966), Lee (1973b), Jones (1974) and Krewski and
Rao (1981).
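The grouped jackknife for a ratio can be sketched as follows, using the four random groups of Example 11.1 in this chapter. The function name is hypothetical. Each pseudo value is formed from the ratio recomputed after dropping one group, exactly as in the definition $t_j = g\,t - (g-1)\,t_{(j)}$.

```python
def grouped_jackknife_ratio(y_tot, x_tot):
    """Delete-one-group jackknife for the ratio t = sum(y)/sum(x).

    Returns (t, t_Q, pseudo_values, v1, v2).
    """
    g = len(y_tot)
    Y, X = sum(y_tot), sum(x_tot)
    t = Y / X
    # t_(j): ratio with the jth group omitted; pseudo value: g*t - (g-1)*t_(j)
    pseudo = [g * t - (g - 1) * (Y - yj) / (X - xj) for yj, xj in zip(y_tot, x_tot)]
    t_q = sum(pseudo) / g
    v1 = sum((p - t_q) ** 2 for p in pseudo) / (g * (g - 1))
    v2 = sum((p - t) ** 2 for p in pseudo) / (g * (g - 1))
    return t, t_q, pseudo, v1, v2


# Group totals of y and x for the four random groups of Example 11.1.
y_groups = [2214, 3938, 1872, 1489]
x_groups = [15.06, 29.35, 14.81, 13.38]
t, t_q, pseudo, v1, v2 = grouped_jackknife_ratio(y_groups, x_groups)
```

The pseudo values differ slightly from those printed in the example, which appear to have been computed with the ratio rounded to 131.03; the variance estimate is unaffected because a constant shift cancels in the deviations.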


11.5 Method of Variance Estimation Based on Balanced Half-samples 

A survey sampler, sometimes, uses stratification to the point where only two 
primary units are selected from each stratum. In such a situation only two 
independent random groups or half-samples are available for variance 
estimation, which may not be as precise as desired. Balanced half
sample replication is a variance estimating tool which increases
the precision of the variance estimator with the help of pseudo-replications.
McCarthy (1966, 1969a, 1969b), following Plackett and Burman (1946), 
developed the mathematics of balancing half-samples. 

Consider a stratified random sampling with H strata, where two
independent selections are made from each stratum. Let $y_{hi}$ denote the value of the
ith unit of the hth stratum in the sample. An unbiased estimate of the population
mean is given by

$$\bar y_{st} = \sum_{h=1}^{H} W_h \bar y_h ,$$

where

$$\bar y_h = \tfrac{1}{2}\,(y_{h1} + y_{h2}) .$$

An estimate of $V(\bar y_{st})$ is $v(\bar y_{st}) = \tfrac{1}{4}\sum_h W_h^2 d_h^2$, where $d_h = y_{h1} - y_{h2}$.

A half sample replicate is formed by choosing one of the two observations
from each stratum. Thus, the half-sample estimate of $\bar Y$ is $\bar y_{hs} = \sum_h W_h y_{hi}$ (i = 1 or
2). It may be seen that there are $2^H$ possible half samples and that the average
of all possible half sample estimates is $\bar y_{st}$. Let $\bar y_{hs,1}$ be one such half-sample
estimate. Then,

$$\bar y_{hs,1} - \bar y_{st} = \sum_h W_h y_{h1} - \tfrac{1}{2}\sum_h W_h (y_{h1} + y_{h2}) = \tfrac{1}{2}\sum_h W_h (y_{h1} - y_{h2}) = \tfrac{1}{2}\sum_h W_h d_h .$$

In general form $\bar y_{hs} - \bar y_{st} = \tfrac{1}{2}(\pm W_1 d_1 \pm W_2 d_2 \pm \ldots \pm W_H d_H)$ and the squared
deviation from $\bar y_{st}$ is of the form

$$(\bar y_{hs} - \bar y_{st})^2 = \tfrac{1}{4}\sum_h W_h^2 d_h^2 + \tfrac{1}{2}\sum_{h<j} \pm\, W_h W_j d_h d_j .$$

Averaged over all $2^H$ possible half samples the cross product terms cancel, leaving $\tfrac{1}{4}\sum_h W_h^2 d_h^2$.

Taking further expectation over repeated sampling from the population,

$$E(\bar y_{hs} - \bar y_{st})^2 = \tfrac{1}{2}\sum_h W_h^2 S_h^2 = V(\bar y_{st}) .$$

Now, suppose we draw with replacement a sample of k half-samples. Denoting
these estimates as $\bar y_{hs,1}, \bar y_{hs,2}, \ldots, \bar y_{hs,k}$, we have

$$v_{hs,k}(\bar y_{st}) = \frac{1}{k}\sum_{i=1}^{k}(\bar y_{hs,i} - \bar y_{st})^2$$

as an estimate of $V(\bar y_{st})$.

The different squares in the expression for $v_{hs,k}(\bar y_{st})$ are not independent and $v_{hs,k}(\bar y_{st})$
is only an approximation to $v(\bar y_{st})$.

The variability among half sample estimates of variance arises from the
between-strata contributions to these estimates in the form of cross product
terms involving $d_h d_j$. But if we take into account the entire set of $2^H$ half
samples, these cross product terms cancel one another. Now a pertinent
question arises whether it is possible to choose a relatively small subset of half
samples such that the cross product terms will disappear. As we shall verify in
the following, such a possibility does exist. For example, consider a three
strata situation with observations $(y_{11}, y_{12})$, $(y_{21}, y_{22})$ and $(y_{31}, y_{32})$.

There are in all $2^3 = 8$ half samples. Let us choose a subset of 4
replications as given below:

Half-sample      Stratum 1    Stratum 2    Stratum 3    $\bar y_{hs,i} - \bar y_{st}$
replication
1                $y_{11}$     $y_{21}$     $y_{31}$     $\tfrac{1}{2}(W_1 d_1 + W_2 d_2 + W_3 d_3)$
2                $y_{11}$     $y_{22}$     $y_{32}$     $\tfrac{1}{2}(W_1 d_1 - W_2 d_2 - W_3 d_3)$
3                $y_{12}$     $y_{22}$     $y_{31}$     $\tfrac{1}{2}(-W_1 d_1 - W_2 d_2 + W_3 d_3)$
4                $y_{12}$     $y_{21}$     $y_{32}$     $\tfrac{1}{2}(-W_1 d_1 + W_2 d_2 - W_3 d_3)$

Squaring these four deviations and averaging them, we have the resulting
expression given by $W_1^2(d_1^2/4) + W_2^2(d_2^2/4) + W_3^2(d_3^2/4)$, which is the estimate
of variance. It may be checked that the columns of '+1' and '-1' in the
deviations are orthogonal to one another, causing the cancellation of cross
product terms. Let us now identify the balanced set of half replicates with a
matrix of plus one (+1) and minus one (-1), where plus one (+1) indicates the
presence of $y_{h1}$ and minus one (-1) that of $y_{h2}$, as given below.

Replication      Stratum 1    Stratum 2    Stratum 3
1                +1           +1           +1
2                +1           -1           -1
3                -1           -1           +1
4                -1           +1           -1


It may be observed that each of the two elements in a stratum appears in half the
samples.

In order to obtain a set of half samples in which the
cross products balance, it is necessary that the columns of the matrix be orthogonal to one another. Such a
method has been described by Plackett and Burman (1946) for obtaining $k \times k$
orthogonal matrices where k is a multiple of 4. With H strata a balanced set of k
replicates can be formed by taking a $k \times k$ orthogonal matrix where k is a multiple
of 4 such that $k \ge H > (k - 4)$. In case $k > H$, the required number of unwanted
columns are deleted from the matrix. The condition of balance is that the
columns of the matrix are orthogonal to one another. Such a balancing leads to
$v_k(\bar y_{st}) = v(\bar y_{st})$, signifying thereby that the k half samples contain all of the
information regarding $V(\bar y_{st})$ contained in all $2^H$ half samples. The cross stratum
component of $v_k(\bar y_{st})$ vanishes because of orthogonality. Another feature of
balancing emerges when the average of all $\bar y_{hs,\alpha}$ $(\alpha = 1, \ldots, k)$ equals $\bar y_{st}$, which
can be achieved whenever the sum of the elements in the column corresponding to
each stratum in the matrix is zero. If both features of balancing exist in a set
of replicates, it is said to have full orthogonal balance, which will be the case
when k is a multiple of 4 and is greater than H. It is desirable to choose k as small
as possible to minimize the number of computations.
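The balancing property can be checked numerically. The sketch below is an illustration, not the construction described in the text: it builds the sign matrix by Sylvester's doubling rule, which yields orthogonal matrices only for orders that are powers of 2, whereas the Plackett and Burman method covers every multiple of 4. The function names, stratum weights and paired observations are hypothetical.

```python
def sylvester_hadamard(k):
    """k x k matrix of +1/-1 with mutually orthogonal columns (k a power of 2)."""
    H = [[1]]
    while len(H) < k:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def balanced_half_sample_variance(pairs, weights):
    """Return (y_st, v_k) for two observations per stratum and stratum weights W_h."""
    n_strata = len(pairs)
    k = 4
    while k <= n_strata:                  # smallest power of 2 (>= 4) exceeding H
        k *= 2
    signs = sylvester_hadamard(k)
    y_st = sum(w * (a + b) / 2 for w, (a, b) in zip(weights, pairs))
    total = 0.0
    for r in range(k):
        # Column 0 is all +1, so stratum h uses column h + 1, whose entries
        # sum to zero; +1 selects the first observation, -1 the second.
        y_hs = sum(w * (a if signs[r][h + 1] == 1 else b)
                   for h, (w, (a, b)) in enumerate(zip(weights, pairs)))
        total += (y_hs - y_st) ** 2
    return y_st, total / k


# Hypothetical data: three strata with weights summing to one.
pairs = [(10.0, 12.0), (20.0, 17.0), (5.0, 9.0)]
weights = [0.5, 0.3, 0.2]
y_st, v_k = balanced_half_sample_variance(pairs, weights)
v_direct = sum(w ** 2 * (a - b) ** 2 for w, (a, b) in zip(weights, pairs)) / 4
```

For this linear estimator the four balanced replicates reproduce $\tfrac{1}{4}\sum_h W_h^2 d_h^2$ exactly, so nothing is lost relative to using all eight half samples.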

The main use of BRR lies in estimating the variance of non-linear estimators
for which simple and unbiased variance estimators are not generally available.
Empirical investigations carried out by McCarthy (1969), Frankel (1971), Kish
and Frankel (1970) and Mulry and Wolter (1981) suggest that balanced half
sample replicates provide satisfactory estimates of the variance (or MSE).
Krewski and Rao (1981) have studied the asymptotic consistency of variance
estimators from the jackknife, the linearization and the BRR methods in respect
of non-linear statistics such as ratios, regression and correlation coefficients.
Rao and Wu (1988) have made second order asymptotic comparisons of these
variance estimators. While Plackett and Burman's method was based upon
Hadamard matrices to construct balanced half samples, Gurney and Jewett
(1975) utilised fractional factorials for the same purpose. Gupta and Nigam
(1987) extended the BRR method to a general sample design with an arbitrary number of
selections from each stratum. They have shown that mixed orthogonal arrays of
strength two, or equivalently, equal frequency orthogonal main effect plans for
symmetrical factorials, give a set of balanced sub-samples useful in variance
estimation.

Although full balance can be achieved with at most H + 3 replicates, the
amount of computation in BRR becomes heavy and expensive if there are many
strata. This difficulty may be overcome by the strategy suggested by McCarthy
(1966, 1969), which runs as follows. Divide the whole universe of H strata into
G sub-universes of H/G strata each. Next, form fully balanced sets of
half samples separately for each of the G sub-universe groups. Combining the
fully balanced sets for each group of strata, we get a consolidated set of half
samples, termed a partially balanced design. It may be pointed out here that the
partially balanced set (PB) is not a fully balanced set, although each of its G
components is. The question as to the best choice of G has been discussed by
Lee (1972), who has shown that a partially balanced design would generally
produce a replicated half sample variance estimator with lower precision
compared to the one provided by the fully balanced design, but the loss in precision is
substantially reduced if the H strata are put in Semi-Ascending Order Arrangement
(SAOA) before the needed partially balanced design is applied.


11.6 Bootstrap Method 

The bootstrap method put forward by Efron (1979) is conceptually the simplest
of all the techniques to estimate the bias and variance of an estimator. Let $x =
(x_1, \ldots, x_n)$ be a sample of independent and identically distributed random
variables with common distribution function F(x) and θ be the unknown
parameter of interest, which is proposed to be estimated by $t = t(x_1, \ldots, x_n)$. Let
$F_n(x)$ be the empirical distribution function of x with probability or mass $1/n$ at
each point $x_1, \ldots, x_n$. The bootstrap method consists in approximating the
sampling distribution of t under F(x) by the bootstrap distribution of $t^*$ based
on a random sample of size n from $F_n(x)$, given by $x^* = (x_1^*, \ldots, x_n^*)$. Efron (1979)
suggests the following Monte Carlo approximation.

1. Construct an empirical probability distribution $F_n(x)$ from the sample by
   placing mass $1/n$ at each point $x_1, \ldots, x_n$. The empirical distribution
   function $F_n(x)$ is the non-parametric maximum likelihood estimate of the
   population distribution function F(x).

2. Draw a simple random sample (with replacement) of size n from $F_n(x)$ and denote
   this resample by $x_b^*$.

3. Calculate the statistic of interest t from the resample, resulting in $t_b^*$.

4. Repeat steps 2 and 3 a large number of times, say B times. Construct a
   relative frequency distribution of $t_b^*$ $(b = 1, \ldots, B)$ by placing a probability or
   mass $1/B$ at each point $t_1^*, t_2^*, \ldots, t_B^*$, thus generating a bootstrap estimate
   of the sampling distribution of t.

Now, each $t_b^*$ provides an independent bootstrap replication of the
statistic of interest t. The bias and variance estimates of t are thus given by

$$(\text{Bias})_B = \bar t^{\,*} - t , \quad \bar t^{\,*} = \frac{1}{B}\sum_{b=1}^{B} t_b^* ,$$

$$(\text{Var})_B = \mathrm{Var}(t^*) = \frac{1}{B-1}\sum_{b=1}^{B}(t_b^* - \bar t^{\,*})^2 .$$

As $B \to \infty$, $\mathrm{Var}(t^*) \to \mathrm{Var}(t)$, or $\sigma(t^*) \to \sigma(t)$.

For most situations, B in the range of 50 to 200 is quite adequate for estimating the
standard error.

Thus, the central theme behind bootstrapping is that the sample is a good
representative of the population and the relative frequency distribution of $t^*$
calculated from the resamples is an estimate of the sampling distribution of t.

Application of the bootstrap method to the finite population situation has been
dealt with by Bickel and Freedman (1984) and Chao and Lo (1985).

Kovar, Rao and Wu (1988) have compared resampling methods such as 
the bootstrap, the jackknife and balanced repeated replication with the 
traditional linearization method for non-linear statistics in context of sampling 
from stratified populations. 
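The Monte Carlo steps of the bootstrap can be sketched for the ratio of two totals. This is a minimal illustration under stated assumptions: the function name, the seed and the choice B = 2000 are arbitrary, units are resampled as (y, x) pairs, and no finite population correction is applied. The data are the guava figures of Table 11.1.

```python
import random
import statistics


def bootstrap_ratio(y, x, B, seed=0):
    """Bootstrap bias and variance estimates for the ratio t = sum(y)/sum(x)."""
    rng = random.Random(seed)
    n = len(y)
    t = sum(y) / sum(x)
    reps = []
    for _ in range(B):
        idx = rng.choices(range(n), k=n)       # step 2: resample n units with replacement
        reps.append(sum(y[i] for i in idx) / sum(x[i] for i in idx))   # step 3
    bias = statistics.mean(reps) - t           # mean of replicates minus t
    var = statistics.variance(reps)            # divisor B - 1
    return t, bias, var, reps


# Table 11.1: number of guava trees (y) and orchard area in acres (x).
y = [492, 1008, 714, 1265, 1889, 784, 294, 798, 780, 619, 403, 467]
x = [4.80, 5.99, 4.27, 8.43, 14.39, 6.53, 1.88, 6.35, 6.58, 9.18, 2.00, 2.20]
t, bias, var, reps = bootstrap_ratio(y, x, B=2000)
```

Results change with the seed and with B, so the estimates from a single run should be read as one Monte Carlo realisation rather than a fixed answer.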


NUMERICAL EXAMPLES 

Example 11.1: A sample survey for the study of yield and cultivation practices 
of guava was conducted in District Allahabad (India) during 1971-72. Out of 
total of 146 guava growing villages in Phulpur-Saran Tehsil, 12 were selected 
by the method of simple random sampling. The data for the total number of 
guava trees and area under orchards for the 12 selected villages are in Table
11.1.

Estimate the variance of the ratio of the total number of guava trees to area 
under orchards by (a) Taylor’s linearization technique (b) method of random 
groups (c) Jackknife technique (d) Bootstrap technique. 


Table 11.1 Total number of guava trees and area under guava orchards in selected
villages (Singh and Chaudhary, 1986).

Sl. no. of village    Total number of guava trees (y)    Area under guava orchards in acres (x)
 1                     492                                 4.80
 2                    1008                                 5.99
 3                     714                                 4.27
 4                    1265                                 8.43
 5                    1889                                14.39
 6                     784                                 6.53
 7                     294                                 1.88
 8                     798                                 6.35
 9                     780                                 6.58
10                     619                                 9.18
11                     403                                 2.00
12                     467                                 2.20

Solution: The estimate of the ratio R is

$$\hat R = \frac{\sum y_i}{\sum x_i} = \frac{9513}{72.60} = 131.033 .$$


The estimates of variance of $\hat R$ by different methods are computed as follows:

a) Taylor's linearization technique

$$v_1(\hat R) = \frac{(N-n)}{Nn}\,\frac{1}{\bar x^2}\,\frac{\sum_i (y_i - \hat R x_i)^2}{n-1}$$


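b) Method of random groups

Assume the sample to be divided into k (= 4) random groups.
$\sum y$, $\sum x$ and $\hat R_i$ for the 4 random groups are as given below:

$\sum y$      $\sum x$      $\hat R_i$
2214          15.06         147.01
3938          29.35         134.17
1872          14.81         126.40
1489          13.38         111.28

$$\bar R = \frac{1}{4}\sum_{i=1}^{4}\hat R_i = 129.72$$

Hence,

$$v_2(\hat R) = \frac{1}{k(k-1)}\sum_{i=1}^{k}(\hat R_i - \bar R)^2 = \frac{669.80}{(4 \times 3)} = 55.82 .$$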


c) Jackknife technique

The pseudo values for the random groups in (b) are
$\hat R'_{\alpha} = k\hat R - (k-1)\hat R_{(\alpha)}$, $(\alpha = 1, \ldots, k)$,
where $\hat R_{(\alpha)}$ is the estimator of R after omitting the αth random group:
$\hat R'_1 = 143.5674$, $\hat R'_2 = 137.4148$,
$\hat R'_3 = 127.4597$ and $\hat R'_4 = 117.6357$.

$\bar R' = \tfrac{1}{4}(526.0776) = 131.5194$

$$v_3(\hat R) = \frac{1}{k(k-1)}\sum_{\alpha=1}^{k}(\hat R'_{\alpha} - \bar R')^2 = \frac{(389.1482)}{(4 \times 3)} = 32.4290 .$$


d) Bootstrap technique

No. of replications (B)    Bias      $v_4(\hat R)$
                           1.1241     65.2591
   500                     1.5222    108.7869
 1,000                     0.5634    102.0416
 5,000                     0.6219    104.0859
10,000                     0.6464    103.8857



EXERCISES 


11.1 Consider the following bootstrap method where a SRSWOR of size n is 
selected from a finite population having N units. Assume N/n = k, an 
integer. 

(a) From the sample $X = (X_1, \ldots, X_n)$ construct a bootstrap population $\Pi$
that repeats the sample k times, i.e. contains k of the $X_1$'s, k of the $X_2$'s, etc.,
thus $\Pi$ having N elements. (b) With k fixed draw a SRSWOR bootstrap
sample of size n, say $X^* = (X_1^*, X_2^*, \ldots, X_n^*)$. (c) Compute

$$\bar X^* = \left(\sum_{i=1}^{n} X_i^*\right)\Big/ n .$$


-here,- 

1 

v N J (N - \)n 



X(X, -X*) 2 (Chao and 

Lo, 1985). 

11.2 Divide the SRSWR of size n into k groups of size m, say, $n = mk$.

Let $\bar y_1, \bar y_2, \ldots, \bar y_k$ be the means of the sub-groups.

Let $\bar y = \frac{1}{k}\sum_{j=1}^{k}\bar y_j$,

$$v_1 = \text{Est. } V(\bar y) = \frac{1}{k(k-1)}\sum_{j=1}^{k}(\bar y_j - \bar y)^2 ,$$

$$v_2 = \text{Est. } V(\bar y) = \frac{1}{n(n-1)}\sum_{i=1}^{n}(y_i - \bar y)^2 , \text{ using the complete sample.}$$

Show that

$$V(v_1) - V(v_2) = \frac{2\sigma^4 (n-k)}{n^2 (n-1)(k-1)} ,$$

which is positive for $n > k$.

11.3 Show that estimators based on replicated samples of unequal sizes are
more efficient than those with equal sizes (Koop, 1967).

11.4 Suggest an estimate of regression coefficient based on stratified random 
sampling. Using Taylor’s series expansion derive its approximate mean 
square error and its estimate. 

11.5 Let x be the characteristic of interest. Suppose we are interested in
estimating the geometric mean $\theta = (X_1 X_2 \cdots X_N)^{1/N}$, $X_i > 0$.

Suggest an appropriate estimate of θ and derive its Taylor series estimator
of variance.

11.6 Let 6] and 02 denote two estimators of 0 whose biases factor as 
£(0i) = 0 +/i(«) a{6) 


£( 02 ) = 0 +fM a{6), 


with 


1 1 

f\M f 2 (n) 


* 0 


Show that the generalized jackknife 


G = 


0i 0 2 

f(n) fiM 


AM 


1 

fiM 


is exactly unbiased for estimating0 (Schucany, Gray and Owen, 1971). 

11.7 Discuss the effect of jackknifing on ratio method of estimation based on
interpenetrating sub-samples in terms of reducing bias and mean square
error (Ghosh and Gomez, 1986).

11.8 Prove that the expected number of distinct units $E(\nu)$ in k equal sized
interpenetrating sub-samples drawn with SRSWOR from a population
of N units is given by

    $E(\nu) = N \left[ 1 - \left( 1 - \frac{n}{Nk} \right)^{k} \right]$,

where n is the overall sample size (Singh and Bansal, 1975).

11.9 Let N = nk, where k is the sampling interval for drawing a systematic
sample of size n. Out of k possible systematic samples p of them are
selected without replacement by choosing p different random starts from
1 to k. Let $y_r$ be the sample total of the observations of the rth systematic
sample (called sub-sample); r = 1, ..., p. We have now

    $t_r = y_1 + y_2 + \cdots + y_{r-1} + (k - r + 1)\, y_r$,

    $t = \frac{1}{p} (t_1 + \cdots + t_p)$.

Show that t is an unbiased estimator of the population total Y.
Find the variance of t and its unbiased estimator (Roy and Singh, 1973).

11.10 Extend the balanced half sample technique to multistage sampling of two
first stage units from each stratum of first stage units.

11.11 Show that the set of all $2^H$ possible half samples is always in full
orthogonal balance.

11.12 Construct a set of half samples in full orthogonal balance when H = 6, 7
and 8.

11.13 Consider a half sample replicate in a stratified random sampling with H
strata. For a sample of k HS replicates show that

i) $v_k(\bar{y}_{st}) = \frac{1}{k} \sum_{\alpha=1}^{k} (\bar{y}_\alpha - \bar{y}_{st})^2$ is unbiased.

ii) $\mathrm{var}(v_k(\bar{y}_{st})) \geq \mathrm{var}(v(\bar{y}_{st}))$.



12. NON-SAMPLING ERRORS



12.1 Introduction 

Non-sampling errors are errors creeping into sample estimates, arising out of 
factors other than the sampling process. As such both the census estimates and 
sample estimates are likely to be subjected to these errors. Although, in general, 
the sampling error decreases with the increase in sample size, the non-sampling 
error has a tendency to increase with the increase in the sample size. The 
computation of the sampling error of an estimate presupposes that non-sampling
error is either absent or negligible, and statistical inference in sampling
from a finite population becomes more complex when the presence of
non-sampling error cannot be overlooked.

Non-sampling errors can occur at every stage of the statistical enquiry 
beginning from the planning till the preparation of final report. Broadly, 
non-sampling errors can be classified into three types: 

(a) Errors due to observations: measurement or observational errors due to 
faulty measuring devices, faulty questionnaires, wrong answers, 
interviewer’s bias, etc. 

(b) Errors due to non-observation : failure to measure or observe some of the 
units of the population which are in the sample. This may be due to 
non-coverage, defective frame, deaths, not-at-homes, unable to answer or 
refusal to answer questions etc. in case of human populations and due to 
destruction of crop for various reasons in case of crop surveys etc. 

(c) Errors in data processing such as editing, coding, tabulating, 
mathematical calculations etc and also errors committed during 
presentation and printing of tabulated results, graphs etc for the final 
report. 

The problems of non-sampling errors can be dealt with effectively by
choosing suitable methods for their assessment and control before the start of
the survey work and, where possible, by building procedures for their
measurement and control into the design. Besides defining the population
unambiguously and taking care in the construction of the frame, the questionnaires
or schedules should be designed to include certain items of information
that provide a check on the quality of the data to be collected. For example, in an
income survey, items on expenditure and saving are to be provided to
check the income figures. Such checks are called consistency checks. Other
methods of assessing and controlling non-sampling errors are to make a sample
check at different stages of operation during the course of investigation and
to take up post-survey checks with the help of better trained investigators.
Further, external checks may be carried out for a sample of units under study by
checking the individual answers against records or documents of some kind.
Such checks are called individual validity checks. For example, birth records
can be used to check the accuracy of age reporting and tax returns on file
to check income reporting. Response errors arising out of the attitude of the
respondents towards the survey, the method of interview, the interviewer's inability
or lack of understanding of the problem in eliciting relevant information, and
recall error are some of the errors which should not be lost sight of in a survey.
The training and skill of interviewers play a dominant role in controlling some
of these errors. The recall error is beyond the control of the respondents, and to
limit such error the time gap between the reporting period and the date of the
survey should be as short as possible.

Non-response or non-observation of certain units selected for study can be
controlled by the method of call backs (making subsequent attempts to get the desired
information), the method of substitution (substitution by observations from
similar units) or the method of using auxiliary information to estimate the missing
observations. Mathematical models with assumptions as to the response
mechanism and about relations between the variables under study may be exploited
to construct estimators that adjust for non-response. Data imputation
methodology has been dealt with extensively by Platek and Gray (1983). More
reflections on the treatment of missing data are due to Dalenius (1983), Kalton
(1983), Kalton and Kasprzyk (1986), Little (1988), Little and Rubin (1987) and
Rubin (1987) among many others. The randomized response query may be
adopted to control false answers or refusals when eliciting information on
sensitive issues.

The data processing errors can be reduced by employing qualified and 
trained personnel for the purpose and through built-in tabulation checks and 
proper supervision. 


12.2 Response Errors 

Response errors, or in a broad sense measurement errors, occur at different stages
of the collection of data. A response error is defined as the difference between the
recorded value of the measurement or response and the true value. Response errors
in a sample survey may arise because of faulty measuring devices, effects of
enumerators, imperfections in the construction of questionnaires or the existence
of variability in respondents' responses under similar survey conditions. The
problem of measuring and controlling the response error assumes importance
and calls for the development of a survey model. The Indian Statistical Institute,
Calcutta, under the pioneering leadership of Professor P.C. Mahalanobis, had
recognised the importance of estimating the effects of response errors
(Mahalanobis, 1940, 1941, 1944, 1946) and hence the techniques for the use of
interpenetrating sub-samples were floated. The U.S. Bureau of Census during
the late forties also initiated post-enumeration surveys for the measurement of
the effects of response errors in censuses and agriculture and business. Thus,
during the last fifty years some outstanding contributions in respect of
measurement errors and estimation of response variance components have been
made by Deming (1944), Hansen, Marks and Mauldin (1951), Sukhatme and
Seth (1952), Hansen, Hurwitz and Bershad (1961), Fellegi (1964), Hansen,
Hurwitz and Pritzker (1965), Bailar and Dalenius (1969) and many others.

The U.S. Bureau of Census has suggested a model which helps to
decompose the mean square error into components reflecting sampling
variance, response variance and other related components. This is described as
follows:


Let there be a population of N units $U_1, \ldots, U_j, \ldots, U_N$. To each unit $U_j$ a value
$X_j$ of the characteristic x is attached. It is desired to estimate $\bar{X} = \sum X_j / N$. The
survey procedure is such that the process of recording response for any unit is
repeatable. Now, consider T trials of simple random samples drawn from the
population. At trial t (t = 1, ..., T) suppose the sample s(t) of n units is recorded as
$y_{1t}, y_{2t}, \ldots, y_{nt}$. Assume the model

    $y_{jt} = X_j + e_{jt}$,

where $X_j$ is the true value and $e_{jt}$ is the error of measurement having a frequency
distribution with mean $\beta_j$ and variance $\sigma_j^2$. The expected unit value over all
trials that include unit j of the population is

    $E(y_{it} \mid i = j) = Y_j = X_j + \beta_j$,

$\beta_j$ representing the bias in the measurement. Define

    $\bar{Y}_{s(t)} = \frac{1}{n} \sum_{j \in s(t)} Y_j$

to be the average of the expected values of the n units included in the tth trial.

The difference between the observed value of the jth unit in the tth trial
and the expected value for that unit over all trials is

    $d_{jt} = y_{jt} - Y_j$,

which is the response deviation for the jth unit. Let

    $\bar{y}_t = \frac{1}{n} \sum_{j \in s(t)} y_{jt}$

be the average of the observations for the units in the tth trial. Then

    $\bar{d}_t = \bar{y}_t - \bar{Y}_{s(t)}$

is the average of the response deviations from the sample.

As an estimate of $\bar{X}$, $\bar{y}_t$ has an error which may be measured by

    $\mathrm{MSE}(\bar{y}_t) = E(\bar{y}_t - \bar{X})^2$,

where the expectation is over all trials and samples.

Hansen, Hurwitz and Bershad (1961) have decomposed

    $\mathrm{MSE}(\bar{y}_t) = E(\bar{y}_t - \bar{Y}_{s(t)})^2 + E(\bar{Y}_{s(t)} - \bar{Y})^2 + 2E(\bar{y}_t - \bar{Y}_{s(t)})(\bar{Y}_{s(t)} - \bar{Y}) + (\bar{Y} - \bar{X})^2$
    = Response variance + sampling variance + covariance of
      response and sampling deviations + square of the bias.

Now, the response variance is

    $E(\bar{d}_t^2) = \frac{1}{n} E(d_{jt}^2) + \frac{n-1}{n} E(d_{jt} d_{kt})$,  $j \neq k$,

    $= \frac{\sigma_d^2}{n} + \frac{n-1}{n} \rho_d \sigma_d^2 = \frac{\sigma_d^2}{n} [1 + (n-1)\rho_d]$,

where the first term is referred to as the simple response variance and the
second is the correlated component of the total response variance; $\rho_d$ is the
average intra-sample correlation coefficient.
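To see how quickly the correlated component can dominate, the response variance formula $(\sigma_d^2/n)[1 + (n-1)\rho_d]$ can be evaluated directly. The sketch below assumes hypothetical values of $\sigma_d^2$, $\rho_d$ and n chosen only for illustration; the function name is likewise an assumption.

```python
def response_variance_components(sigma_d2, rho_d, n):
    """Split the total response variance (sigma_d^2 / n) [1 + (n - 1) rho_d]
    into its simple and correlated components."""
    simple = sigma_d2 / n
    correlated = (n - 1) * rho_d * sigma_d2 / n
    return simple, correlated, simple + correlated

# Hypothetical inputs: sigma_d^2 = 4, rho_d = 0.05, n = 100 units per trial.
simple, correlated, total = response_variance_components(4.0, 0.05, 100)
print(f"simple = {simple:.4f}, correlated = {correlated:.4f}, total = {total:.4f}")
```

Even a small intra-sample correlation makes the correlated component several times the simple response variance once n is large, since that component does not shrink like 1/n.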

Two fundamental methods are available (Mahalanobis, 1946) for 
estimating response variance components - (i) Method of replication and (ii) 
Method of interpenetration. 

In the first method the same sample is used in more than one trial, permitting
unit by unit comparison. In the second method different samples are used in the
trials and we then have to be satisfied with sample by sample comparison.
In both the methods either the same set of investigators or different sets of
investigators are employed. Further, one may also visualize a mixture of the two
methods, replication and interpenetration. If the only component of interest is the
simple response variance, the sample needs to be replicated, that is, at
least two measurements are to be made on the same unit. To measure the interviewer
effects, interpenetrating sub-samples seem to be the desired method. For more
details on response variance estimation the reader may refer to Fellegi (1964) in
connection with the 1961 Canadian census and to Bailar and Dalenius (1969).

Sukhatme and Seth (1952) have proposed a general model for measuring
observational errors, given by

    $y_{ijk} = x_i + \alpha_j + \delta_{ij} + e_{ijk}$,

where $y_{ijk}$ (i = 1, ..., h; j = 1, ..., m; k = 0, 1, ..., $n_{ij}$) denotes the value reported
by the jth enumerator on the ith unit for the kth occasion, $x_i$ denotes the true
value of the character on the ith unit in a simple random sample of h units drawn
from N units, $\alpha_j$ represents the bias of the jth enumerator in repeated
observations on all units, $\delta_{ij}$ denotes the interaction of the jth enumerator with
the ith unit and $e_{ijk}$ is the deviation from $x_i + \alpha_j + \delta_{ij}$ when the jth enumerator
reports on the ith unit on the kth occasion.

Battese, Fuller and Hickman (1976) have considered estimation of
response variances from interview-reinterview surveys.

Now consider a simple model with $\delta_{ij} = 0$, that is,

    $y_{ij} = x_i + \alpha_j + e_{ij}$,  i = 1, ..., h; j = 1, ..., m,

with assumptions

    $E(e_{ij} \mid i, j) = 0$,  $E(e_{ij}^2) = S_e^2$,  $\mathrm{cov}(e_{ij}, e_{i'j'}) = 0$,  $i \neq i'$, $j \neq j'$.

Denote $n_{ij}$ as the number of observations made by the jth enumerator on the ith
unit, $n_{ij}$ = 1 or 0. Let $n_{i.} = \sum_j n_{ij}$ = number of observations on the ith unit and
$n_{.j} = \sum_i n_{ij}$ = number of observations made by the jth enumerator.

Let $\bar{y}_{.j}$ be the mean based on $n_{.j}$ observations made by the jth enumerator
and $\bar{y}_{..}$ be the mean of all the $n = \sum_{i=1}^{h} \sum_{j=1}^{m} n_{ij}$ observations in the sample.

Assume that

i) the m enumerators are a simple random sample out of the population of M
enumerators

ii) the h units of the sample are randomly allotted to different enumerators in
the sample

iii) $n_{.j} = n/m = \bar{n}$, that is, each enumerator makes an equal number of
observations.

iv) $n_{i.} = n/h = p$, that is, an equal number of observations is made on each unit in
the sample.

v) N is the total number of units in the population.

Under the assumed model and subsequent assumptions we have

    $\bar{y}_{..} = \frac{1}{h} \sum_{i=1}^{h} x_i + \frac{1}{m} \sum_{j=1}^{m} \alpha_j + \frac{1}{n} \sum_{i=1}^{h} \sum_{j=1}^{m} n_{ij} e_{ij}$.

We have

    $E(\bar{y}_{.j}) = \frac{1}{N} \sum_{i=1}^{N} x_i + \frac{1}{M} \sum_{j=1}^{M} \alpha_j = \mu + \bar{\alpha}$,

    $E(\bar{y}_{..}) = \mu + \bar{\alpha}$,

where $\mu$ is the population mean of the true values $x_i$ and $\bar{\alpha}$ is the population mean of
the enumerators' biases.

It may be seen that $\bar{y}_{..}$ is a biased estimator of the population mean unless the $\alpha$'s
vary in such a way that $\bar{\alpha}$ equals zero. We find


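    $V(\bar{y}_{..}) = \left( \frac{1}{h} - \frac{1}{N} \right) S_x^2 + \left( \frac{1}{m} - \frac{1}{M} \right) S_\alpha^2 + \frac{S_e^2}{n}$,

where

    $S_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu)^2$

and

    $S_\alpha^2 = \frac{1}{M-1} \sum_{j=1}^{M} (\alpha_j - \bar{\alpha})^2$.

If N and M are large,

    $V(\bar{y}_{..}) = \frac{S_x^2}{h} + \frac{S_\alpha^2}{m} + \frac{S_e^2}{n}$.

When p = 1, n = hp = h and

    $V(\bar{y}_{..}) = \frac{S_x^2 + S_e^2}{h} + \frac{S_\alpha^2}{m} = \frac{S_x^2 + S_e^2 + S_\alpha^2}{h} + S_\alpha^2 \left( \frac{1}{m} - \frac{1}{h} \right) = \frac{S_y^2}{h} + S_\alpha^2 \left( \frac{1}{m} - \frac{1}{h} \right)$,

where $S_y^2 = S_x^2 + S_e^2 + S_\alpha^2$.

Denoting $\rho$ as the correlation between responses obtained by the same
enumerator, we have, for large N and M,

    $\rho S_y^2 = S_\alpha^2$.

Thus,

    $V(\bar{y}_{..}) = \frac{S_y^2}{h} \left[ 1 + \rho \left( \frac{h}{m} - 1 \right) \right] = \frac{S_y^2}{h} [1 + \rho(\bar{n} - 1)]$.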


This shows that the sampling variance of the estimator is inflated by the
variability in the biases of the enumerators. Define

    $s_c^2 = \frac{1}{m-1} \sum_{j=1}^{m} (\bar{y}_{.j} - \bar{y}_{..})^2$,

    $s_w^2 = \frac{1}{m(\bar{n}-1)} \sum_{j=1}^{m} \sum_{i=1}^{\bar{n}} (y_{ij} - \bar{y}_{.j})^2$

and

    $s_u^2 = \frac{1}{h-1} \sum_{i=1}^{h} (\bar{y}_{i.} - \bar{y}_{..})^2$,

where $s_c^2$ is the mean square between the means of the m enumerators, $s_w^2$ is the
mean square between observations within enumerators and $s_u^2$ is the mean
square between unit means. We find,

    $E(s_c^2) = \frac{m(m-p)}{hp(m-1)} S_x^2 + S_\alpha^2 + \frac{m}{hp} S_e^2$,

    $E(s_w^2) = S_x^2 + S_e^2$,

    $E(s_u^2) = S_x^2 + \frac{S_e^2}{p} + \frac{h(m-p)}{(h-1)mp} S_\alpha^2$.

These equations, being linearly independent, provide estimates of $S_x^2$, $S_\alpha^2$ and $S_e^2$
and hence of $V(\bar{y}_{..})$.

If p = 1,

    $E(s_c^2) = \frac{m}{h} (S_x^2 + S_e^2) + S_\alpha^2$,

    $E(s_w^2) = S_x^2 + S_e^2$,

    $E(s_u^2) = S_x^2 + S_e^2 + \frac{h}{h-1} \cdot \frac{m-1}{m} S_\alpha^2$.

These give

i) $s_w^2$ as an unbiased estimator of $S_x^2 + S_e^2$,

ii) $(h-1) s_u^2 = (m-1) \frac{h}{m} s_c^2 + (h-m) s_w^2$, and

iii) Est. $S_\alpha^2 = s_c^2 - \frac{m}{h} s_w^2$.

For large N and M,

    Est. $V(\bar{y}_{..}) = \frac{s_c^2}{m} = \frac{s_u^2}{h} + \frac{h-m}{(m-1)h} (s_u^2 - s_w^2)$.

Thus, it is seen that $s_u^2/h$ does not give an unbiased estimator of the variance of the
estimated mean, but has to be inflated by $[(h-m)/\{(m-1)h\}] (s_u^2 - s_w^2)$.
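The p = 1 relations above can be checked numerically. The following sketch assumes a small hypothetical data set in which each of m = 3 enumerators reports on 3 different units (so h = 9); the function name and the data are illustrative only. It computes the three mean squares, the estimate of the enumerator variance component, and the variance estimate $s_c^2/m$.

```python
def enumerator_mean_squares(groups):
    """groups[j] lists the observations made by enumerator j (p = 1, equal sizes).
    Returns s_c^2, s_w^2, s_u^2, Est. S_alpha^2 and Est. V(y-bar) = s_c^2 / m."""
    m = len(groups)
    n_bar = len(groups[0])
    h = m * n_bar
    grand = sum(sum(g) for g in groups) / h
    means = [sum(g) / n_bar for g in groups]
    # Mean square between enumerator means (no n_bar multiplier).
    s_c2 = sum((mj - grand) ** 2 for mj in means) / (m - 1)
    # Mean square within enumerators, divisor m (n_bar - 1) = h - m.
    s_w2 = sum((y - mj) ** 2 for g, mj in zip(groups, means) for y in g) / (h - m)
    # With p = 1 each unit mean is a single observation.
    s_u2 = sum((y - grand) ** 2 for g in groups for y in g) / (h - 1)
    est_s_alpha2 = s_c2 - (m / h) * s_w2
    return s_c2, s_w2, s_u2, est_s_alpha2, s_c2 / m

# Hypothetical reports from three enumerators.
data = [[10, 12, 11], [14, 15, 13], [9, 8, 10]]
s_c2, s_w2, s_u2, est_s_alpha2, est_var = enumerator_mean_squares(data)
m, h = 3, 9
print(est_var, s_u2 / h + (h - m) / ((m - 1) * h) * (s_u2 - s_w2))
```

Both printed quantities agree, illustrating that $s_u^2/h$ alone understates the variance estimate whenever $s_u^2 > s_w^2$.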


12.3 Errors Due to Non-response 

Non-response may be defined as a failure to obtain information from units 
legitimately included in the sample. 

Some amount of non-response cannot always be avoided in any large
scale survey, whether based on a census or a sample. This affects the estimates by
introducing both a possible bias and an increase in sampling variance. Thus
non-response is regarded as an important measure of data quality and the survey
sampler is required professionally to see that non-response is either completely
eliminated or reduced to an acceptable level. The potential causes of
non-response may basically be due to the contents of the survey, the method of
collection of data or the respondents. In any survey there may be
non-response due to different causes requiring different types of treatment for
their control. For example, in a household survey non-response might occur due
to temporary absence of the informant, refusal, non-availability of the
interviewer, unfavourable climatic factors, operational difficulties, time and
cost constraints, lack of cooperation from the respondents, the inability of the
interviewer to follow up the cases of temporary absentees, etc. Refusal rates can
be substantially reduced by proper motivation of respondents and by cultivating
good interviewer-respondent relations. This necessitates proper training of the
interviewers in the public relations or motivational aspect of their job. The
choice of respondent and the respondent burden play a significant role in
affecting non-response. To obtain data from each individual (non-proxy) may
be time-consuming and more expensive. On the other hand proxy reporting
may result in higher response rates and a possible reduction in respondent
burden and cost of data collection. Therefore, the choice of suitable proxy
respondents may be taken as an effective means of lowering the non-response
rate over and above the reduction in cost of survey.

12.3.1 Bias Due to Non-response

Suppose an SRSWOR s of size n has been selected from a population to
measure the characteristic y. Further, suppose $q_k$ is the probability that the kth unit
responds. Define the sample mean as

    $\bar{y} = \sum_{k=1}^{N} I_k Y_k \Big/ \sum_{k=1}^{N} I_k$,

where $I_k$ is an indicator function which takes the value 1 if the kth unit is in the
sample and also responds and zero otherwise. Then $E(I_k) = (n/N) q_k$, assuming
that the events that the kth unit is in the sample and that the kth unit responds are
independent. Thus, approximately,

    $E(\bar{y}) \approx E\left[ \sum_{k=1}^{N} I_k Y_k \right] \Big/ E\left[ \sum_{k=1}^{N} I_k \right] = \sum_{k=1}^{N} \frac{n}{N} q_k Y_k \Big/ \sum_{k=1}^{N} \frac{n}{N} q_k = \sum_{k=1}^{N} q_k Y_k \Big/ \sum_{k=1}^{N} q_k$.

The approximate bias $\sum_{k=1}^{N} \left( \frac{q_k}{\sum q_k} - \frac{1}{N} \right) Y_k$ becomes negligible for $q_k = \bar{q} = \sum q_k / N$.

Even if the bias becomes negligible, $V(\bar{y})$ is no doubt inflated because of
the reduced size of the sample due to non-response. Thus, non-response not
only induces bias, but also inflates the variance.
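The approximate bias above is simply the difference between a $q_k$-weighted mean and the plain population mean, which a short sketch makes concrete. The population values and response probabilities below are hypothetical, and the function name is an illustrative assumption.

```python
def respondent_mean_bias(y, q):
    """Approximate bias of the respondent mean:
    sum(q_k Y_k) / sum(q_k) - sum(Y_k) / N."""
    weighted = sum(qk * yk for qk, yk in zip(q, y)) / sum(q)
    return weighted - sum(y) / len(y)

# Hypothetical population: units with larger Y respond less often.
y_values = [10.0, 20.0, 30.0, 40.0]
bias_unequal = respondent_mean_bias(y_values, [0.9, 0.8, 0.6, 0.5])
bias_equal = respondent_mean_bias(y_values, [0.7, 0.7, 0.7, 0.7])
print(bias_unequal, bias_equal)
```

With equal response probabilities the bias vanishes; it appears only when the response probability is related to the study variable.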

12.3.2 Sub-sampling of Non-respondents 

If the first attempt to collect information from the sampled units results in 
incomplete samples or non-response, Hansen and Hurwitz (1946) suggested a 
call-back technique to sub-sample the non-respondents and use some extra 
effort to collect information from every one in the sub-sample. Deming (1953) 
has developed a model for the study of different call back strategies. In this 
regard references may also be made to Durbin (1954) and Durbin and Stuart 
(1954). 

Suppose that the population of N units is visualized as consisting of two
classes, a response class and a non-response class, of sizes $N_1$ and $N_2$ respectively,
so that $N = N_1 + N_2$. In an SRSWOR of size n, suppose $n_1$ units responded and
$n_2$ units did not respond ($n = n_1 + n_2$). Out of the $n_2$ units who did not respond,
select again an SRSWOR of size $r_2$ to be covered with intensive efforts. Let
$n_2 = k r_2$ (k > 1). Let y be the character under study and $\bar{y}_1$, $\bar{y}_{r2}$ be the sample means
based on $n_1$ and $r_2$ units belonging to the response and non-response classes
respectively.

An estimate of the population mean $\bar{Y}$ is given by

    $\bar{y}' = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_{r2}}{n}$.

    $E(\bar{y}') = E\,E[\bar{y}' \mid n_1, n_2] = E(\bar{y}) = \bar{Y}$,

    $V(\bar{y}') = V[E(\bar{y}' \mid n_1, n_2)] + E[V(\bar{y}' \mid n_1, n_2)]$

    $= V(\bar{y}) + E\left[ \left( \frac{n_2}{n} \right)^2 V(\bar{y}_{r2}) \right]$

    $= \left( \frac{1}{n} - \frac{1}{N} \right) S^2 + E\left[ \frac{n_2^2}{n^2} \left( \frac{1}{r_2} - \frac{1}{n_2} \right) S_2^2 \right]$

    $= \left( \frac{1}{n} - \frac{1}{N} \right) S^2 + \frac{(k-1) W_2}{n} S_2^2$,

where $E(n_2) = n(N_2/N) = nW_2$, $W_2 = N_2/N$, $S^2$ is the variance of the whole
population and $S_2^2$ is the variance for the non-response group of $N_2$ units in the population.

The first term is the variance that would be obtained if all the $n_2$ units are
observed and the second term is the increase in variance from sampling only $r_2$
out of $n_2$.

As n and k are at our choice, we may like to obtain optimum values of n
and k by minimizing the average cost for a pre-assigned value of the expected
variance.

The average cost is represented by

    $E(C) = E[c_0 n + c_1 n_1 + c_2 r_2] = c_0 n + c_1 W_1 n + \frac{c_2 W_2 n}{k}$,

where $W_1 = N_1/N$, $c_0$ is the cost of making the first attempt, $c_1$ is the cost of
processing data obtained from the first attempt and $c_2$ is the cost of collecting
and processing data in the second attempt from the non-response class.
The optimum solutions are

    $k_{opt} = \sqrt{ \frac{c_2 (S^2 - W_2 S_2^2)}{S_2^2 (c_0 + c_1 W_1)} }$

and

    $n_{opt} = \frac{S^2 + (k-1) W_2 S_2^2}{V_0 + S^2/N}$,

where $V_0$ is the pre-assigned value for the variance of the estimated population
mean.

The optimum values of k and n depend on knowledge of $W_2$, $S^2$ and
$S_2^2$, which are usually not known and hence must be estimated from
previous experience. Of course $S_2^2$ is harder to predict than $S^2$ and,
moreover, may not be the same as $S^2$. When $S_2^2 = S^2$, k may assume a value less
than 1, unless the cost of collecting data at the second attempt is considerably
more than that of collecting at the first attempt. Extensions of Hansen and
Hurwitz's technique are due to El-Badry (1956) and Foradori (1961). The reader
may also refer to Deming (1953), Dalenius (1955), Madow and Olkin (1983)
and Srinath (1971) for more research work on incomplete samples.
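The optimum sub-sampling rule can be tried out numerically. The sketch below evaluates $k_{opt}$ and $n_{opt}$ from the formulas of this section and confirms that the resulting design attains the pre-assigned variance $V_0$. All input values (variances, non-response share, unit costs) are hypothetical planning figures and the function names are illustrative assumptions.

```python
import math

def hansen_hurwitz_optimum(S2, S2_2, W2, c0, c1, c2, V0, N):
    """Optimum k and n for sub-sampling non-respondents, plus the expected cost.
    S2: population variance, S2_2: variance in the non-response class,
    W2: share of non-respondents, c0, c1, c2: unit costs, V0: target variance."""
    W1 = 1.0 - W2
    k = math.sqrt(c2 * (S2 - W2 * S2_2) / (S2_2 * (c0 + c1 * W1)))
    n = (S2 + (k - 1.0) * W2 * S2_2) / (V0 + S2 / N)
    expected_cost = n * (c0 + c1 * W1 + c2 * W2 / k)
    return k, n, expected_cost

def hansen_hurwitz_variance(n, k, S2, S2_2, W2, N):
    """V = (1/n - 1/N) S^2 + (k - 1) W2 S_2^2 / n."""
    return (1.0 / n - 1.0 / N) * S2 + (k - 1.0) * W2 * S2_2 / n

# Hypothetical planning values.
k_opt, n_opt, cost = hansen_hurwitz_optimum(
    S2=100.0, S2_2=80.0, W2=0.3, c0=1.0, c1=4.0, c2=40.0, V0=0.5, N=10000)
achieved = hansen_hurwitz_variance(n_opt, k_opt, 100.0, 80.0, 0.3, 10000)
print(f"k = {k_opt:.3f}, n = {n_opt:.1f}, cost = {cost:.1f}, variance = {achieved:.4f}")
```

In practice n would be rounded up and k rounded to a convenient sub-sampling fraction, so the achieved variance would differ slightly from $V_0$.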

12.3.3 Politz-Simmons Technique 

A weighting technique to reduce bias at the first call was suggested by Politz and
Simmons (1949, 1950). Suppose the interviews are conducted during evening
hours on six week days. It is assumed that the interviewer calls on the
respondents randomly during interviewing hours. If the respondent is found at
home, the required information is collected from him along with the
information as to how many days he was available during interviewing hours on the
preceding five days. This information is noted to determine the probability of the
respondent's availability. In some surveys the Politz-Simmons technique is more
cost efficient than repeated call backs. P.S.R.S. Rao (1983) has elaborated these
approaches in more detail.

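Suppose an SRSWR of n respondents is selected for interviewing.
Let $P_i$ be the probability that the ith respondent is available at the first call
during specified hours.

If the ith respondent was available at home during the last t evenings out of
five, the estimate of $P_i$ is given by

    $\hat{P}_i = \frac{t+1}{6}$,  t = 0, 1, 2, ..., 5.

As an estimator of $\bar{Y}$, we may have

    $\bar{y}_{ps} = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{\hat{P}_i}$;

$y_i$ takes the value zero if the respondent is not available at the time of the first call.

Let $p_{it}$ be the probability that the ith respondent is available t times out of the five
previous evenings. We have,

    $p_{it} = \binom{5}{t} P_i^t (1 - P_i)^{5-t}$.

Now

    $E\left( \frac{1}{\hat{P}_i} \right) = \sum_{t=0}^{5} \frac{6}{t+1} \binom{5}{t} P_i^t Q_i^{5-t} = \frac{1}{P_i} (1 - Q_i^6)$.

The probability that the ith unit is selected and is found at home is $P_i/N$. Thus,

    $E(\bar{y}_{ps}) = E\,E[\bar{y}_{ps} \mid t] = \frac{1}{n} E \sum_{i=1}^{n} E\left[ \frac{y_i}{\hat{P}_i} \,\Big|\, t \right] = \frac{1}{N} \sum_{i=1}^{N} Y_i (1 - Q_i^6)$,

where $Q_i = 1 - P_i$. Hence, $\bar{y}_{ps}$ is a biased estimator of $\bar{Y}$ and the bias is negligible if
the probability that a respondent is never found at home is very small.

    $V(\bar{y}_{ps}) = \frac{1}{n} \left[ \frac{1}{N} \sum_{i=1}^{N} T_i Y_i^2 - \left\{ \frac{1}{N} \sum_{i=1}^{N} Y_i (1 - Q_i^6) \right\}^2 \right]$,

where

    $T_i = \sum_{t=0}^{5} \frac{6}{t+1} \binom{6}{t+1} P_i^{t+1} (1 - P_i)^{6-(t+1)}$.

    Est. $V(\bar{y}_{ps}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left( \frac{y_i}{\hat{P}_i} - \bar{y}_{ps} \right)^2$.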

The Politz-Simmons technique becomes advantageous when call-backs 
are not feasible. This technique may turn out to be less efficient when the first 
call results in an arrangement such that it becomes possible to enumerate all the 
respondents at the second call. 
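A short sketch may help fix the mechanics of the Politz-Simmons weighting. It first checks the identity $E(1/\hat{P}_i) = (1 - Q_i^6)/P_i$ numerically and then computes the weighted mean and its variance estimate for a tiny hypothetical sample; the function names and data are assumptions made for illustration.

```python
from math import comb

def expected_inverse_weight(P):
    """E(1 / P_hat) when P_hat = (t + 1) / 6 and t ~ Binomial(5, P)."""
    Q = 1.0 - P
    return sum(6.0 / (t + 1) * comb(5, t) * P ** t * Q ** (5 - t) for t in range(6))

def politz_simmons_mean(y, t):
    """y[i]: observed value (0 if the person was not at home at the call);
    t[i]: number of the previous five evenings the person reports being at home
    (any placeholder works when y[i] is 0, since the weighted value is 0)."""
    n = len(y)
    z = [yi * 6.0 / (ti + 1) for yi, ti in zip(y, t)]
    mean = sum(z) / n
    var_est = sum((zi - mean) ** 2 for zi in z) / (n * (n - 1))
    return mean, var_est

# Hypothetical sample of three calls; the second person was not at home.
ps_mean, ps_var = politz_simmons_mean([12.0, 0.0, 6.0], [5, 0, 2])
print(ps_mean, ps_var, expected_inverse_weight(0.5))
```

Respondents who are rarely at home receive large weights 6/(t + 1), which is how the first-call sample is made to stand in for those who were missed.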

12.3.4 Warner's Randomized Response Technique 

Sometimes in a survey one may like to have answers on some sensitive issues
such as alcoholism, gambling, venereal disease, political preference, etc. in a
community. The respondents invariably feel shy or even refuse to reply on
direct questioning. Even if they answer, there is a likelihood of giving false
answers. Warner (1965) advocated a randomized response (RR) query
protecting the privacy of respondents in place of a direct response (DR) query
regarding a qualitative stigmatized character A for which the response may be
either 'yes' or 'no'. He named his technique the RR technique because the
respondent answers a question which he selects at random from two or more
questions without the knowledge of the interviewer as to which question he is
answering.

Let us suppose we have an SRSWR sample s of size n taken from a
community having N persons to estimate the proportion $\pi_A$ of the stigmatized
character A in the population. Denote $1 - \pi_A$ as the proportion of $A^c$, the
complement of A. Instead of putting a direct question to the respondent
'whether you belong to A', let us ask the sampled person the same question with
probability P, where P is suitably chosen, and its negation with probability
1 - P. The respondent gives a 'yes' or 'no' answer to the question put through the
random device, hopefully in a truthful manner, assuming that the interviewer
does not know which of the two complementary questions he is answering.
According to Warner, each interviewer is supplied with an identical spinner, a
RR device with the face marked so that the spinner points to the letter A with
probability P and to the letter $A^c$ with probability (1 - P). The respondent is
directed to spin the spinner unobserved by the interviewer and reports 'yes' if
the spinner points to the letter representing the group to which the interviewee
belongs and 'no' otherwise.

Thus, in a RR procedure the proportion of 'yes' answers $p^*$ in a SRSWR
of size n estimates the probability of 'yes' answers

    $\pi^* = P\pi_A + (1 - P)(1 - \pi_A) = (1 - P) + (2P - 1)\pi_A$.

Equating $(1 - P) + (2P - 1)\pi_A$ to $p^*$ and solving for $\pi_A$ gives the estimate $\hat{\pi}_A$ of
$\pi_A$. That is,

    $\hat{\pi}_A = \frac{p^* - (1 - P)}{2P - 1}$, provided $P \neq 1/2$.

$\hat{\pi}_A$ turns out to be the maximum likelihood estimate of $\pi_A$.

    $V(\hat{\pi}_A) = \frac{V(p^*)}{(2P-1)^2} = \frac{\pi^* (1 - \pi^*)}{n(2P-1)^2} = \frac{\pi_A (1 - \pi_A)}{n} + \frac{P(1-P)}{n(2P-1)^2}$.

Thus, the increase in variance due to RR compared to direct questioning is
$P(1-P)/n(2P-1)^2$, which decreases as P approaches either 0 or 1. Further, the
loss in efficiency tends to increase as P approaches 1/2, an ideal value of P
acceptable to the respondent for the sake of protecting his privacy. The
interviewer, on his part, may like to make P as close to 0 or 1 as possible without
threatening the privacy of the respondents.

An unbiased estimator of $V(\hat{\pi}_A)$ is given by

    $v(\hat{\pi}_A) = \frac{p^* (1 - p^*)}{(n-1)(2P-1)^2} = \frac{1}{n-1} \left[ \hat{\pi}_A (1 - \hat{\pi}_A) + \frac{P(1-P)}{(2P-1)^2} \right]$.

Greenberg et al. (1969) have shown that in the case when the respondent is not
completely truthful in reporting, $\hat{\pi}_A$ is biased and its mean square error is given
by

    $\mathrm{MSE}(\hat{\pi}_A) = \frac{\pi_A \theta (1 - \pi_A \theta)}{n} + \frac{P(1-P)}{n(2P-1)^2} + \pi_A^2 (\theta - 1)^2$,

where $\theta$ is the probability of telling the truth. Abul-Ela, Greenberg and Horvitz
(1967), Mukherjee (1981), Bourke (1982), Silva (1983) and others have
extended Warner's model to estimate several sensitive proportions.
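As a rough illustration of Warner's estimator and its variance estimator, consider the sketch below. It assumes truthful reporting; the function name and the sample figures (401 respondents, 40 per cent 'yes' answers, P = 0.7) are hypothetical.

```python
def warner_estimate(p_yes, n, P):
    """Warner (1965) estimate of pi_A and the unbiased estimate of its variance.
    p_yes: observed proportion of 'yes' answers, n: SRSWR sample size,
    P: probability that the device selects the sensitive statement."""
    if P == 0.5:
        raise ValueError("P must differ from 1/2")
    pi_hat = (p_yes - (1.0 - P)) / (2.0 * P - 1.0)
    var_hat = p_yes * (1.0 - p_yes) / ((n - 1) * (2.0 * P - 1.0) ** 2)
    return pi_hat, var_hat

# Hypothetical survey outcome.
w_pi, w_var = warner_estimate(p_yes=0.4, n=401, P=0.7)
# Equivalent form of the variance estimator in terms of pi_hat.
w_var_alt = (w_pi * (1 - w_pi) + 0.7 * 0.3 / (2 * 0.7 - 1) ** 2) / (401 - 1)
print(w_pi, w_var, w_var_alt)
```

Note that the estimate can fall outside [0, 1] in small samples; the sketch does not truncate it.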

Mangat and Singh (1990) have considered a new randomized response
procedure with the help of two random devices. The random device $R_1$ consists
of two statements, namely, (i) 'I belong to the sensitive group A' and (ii) 'Go to
random device $R_2$', represented with probabilities T and 1 - T respectively. The
random device $R_2$ consists of two statements: (i) 'I belong to the sensitive group A' and
(ii) 'I do not belong to A', with known probabilities P and 1 - P. The respondent is
first instructed to use $R_1$. Depending on the outcome of $R_1$, he is to use $R_2$. In
case of outcome $R_1$(i), $R_2$(i) or $R_2$(ii) the respondent is required to answer 'yes'
or 'no' according to the statement and the actual status he possesses. Both the devices are
operated unobserved by the interviewer. If $\pi^{**}$ is the probability of a 'yes'
answer,

    $\pi^{**} = T\pi_A + (1 - T)[P\pi_A + (1 - P)(1 - \pi_A)]$.

The maximum likelihood estimate of $\pi_A$ turns out to be

    $\hat{\pi}'_A = \frac{p^{**} - (1 - T)(1 - P)}{2P - 1 + 2T(1 - P)}$,

where $p^{**}$ is the proportion of 'yes' answers in a sample of n respondents
selected by SRSWR.

    $V(\hat{\pi}'_A) = \frac{\pi_A (1 - \pi_A)}{n} + \frac{(1 - T)(1 - P)[1 - (1 - T)(1 - P)]}{n[2P - 1 + 2T(1 - P)]^2}$.

An unbiased estimate of $V(\hat{\pi}'_A)$ is

    Est. $V(\hat{\pi}'_A) = \frac{p^{**} (1 - p^{**})}{(n - 1)[2P - 1 + 2T(1 - P)]^2}$.

It may be verified that Mangat-Singh's estimator $\hat{\pi}'_A$ will be more efficient than
Warner's estimator $\hat{\pi}_A$ if $T > (1 - 2P)/(1 - P)$. The inequality shows that
Mangat-Singh's strategy can always be made more efficient than that of
Warner by a suitable choice of T for any practical value of P.
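The comparison with Warner's strategy can be illustrated numerically. The sketch below codes the two-device estimator and the two theoretical variances from this section; the function names and the chosen values of $\pi_A$, n, P and T are hypothetical.

```python
def mangat_singh_estimate(p_yes, n, P, T):
    """Two-device estimate of pi_A and the unbiased estimate of its variance."""
    denom = 2.0 * P - 1.0 + 2.0 * T * (1.0 - P)
    if denom == 0.0:
        raise ValueError("2P - 1 + 2T(1 - P) must be non-zero")
    pi_hat = (p_yes - (1.0 - T) * (1.0 - P)) / denom
    var_hat = p_yes * (1.0 - p_yes) / ((n - 1) * denom ** 2)
    return pi_hat, var_hat

def mangat_singh_variance(pi_A, n, P, T):
    """Theoretical variance of the two-device estimator."""
    c = (1.0 - T) * (1.0 - P)
    denom = 2.0 * P - 1.0 + 2.0 * T * (1.0 - P)
    return pi_A * (1.0 - pi_A) / n + c * (1.0 - c) / (n * denom ** 2)

def warner_variance(pi_A, n, P):
    """Theoretical variance of Warner's estimator."""
    return pi_A * (1.0 - pi_A) / n + P * (1.0 - P) / (n * (2.0 * P - 1.0) ** 2)

# Hypothetical setting: pi_A = 0.25, P = 0.7, T = 0.5, so pi** = 0.325.
ms_pi, ms_var_hat = mangat_singh_estimate(p_yes=0.325, n=500, P=0.7, T=0.5)
v_ms = mangat_singh_variance(0.25, 500, 0.7, 0.5)
v_w = warner_variance(0.25, 500, 0.7)
print(ms_pi, v_ms, v_w)
```

Setting T = 0 sends every respondent to the second device, in which case the two variance functions coincide, as expected.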

Mangat and Singh (1991a, b) have also proposed an alternative
randomized response procedure for SRSWOR.

Mangat (1994) has suggested an alternative randomized response
strategy, which runs as follows. Each of the sampled respondents is instructed to say 'yes'
if he has attribute A. If he does not have attribute A, he is required to use
Warner's device consisting of two statements: (a) 'I belong to A' and (b) 'I do not
have attribute A', represented with probabilities P and 1 - P respectively. The
probability of a 'yes' answer is

    $\alpha = \pi_A + (1 - \pi_A)(1 - P)$.

The maximum likelihood estimator of $\pi_A$ is

    $\hat{\pi}_A = \frac{\hat{\alpha} - 1 + P}{P}$,

where $\hat{\alpha}$ is the observed proportion of 'yes' answers.

    $V(\hat{\pi}_A) = \frac{\pi_A (1 - \pi_A)}{n} + \frac{(1 - \pi_A)(1 - P)}{nP}$,

    Est. $V(\hat{\pi}_A) = \frac{\hat{\alpha} (1 - \hat{\alpha})}{(n - 1) P^2}$.

It can be verified that Mangat's (1994) strategy is more efficient than
Warner's (1965) strategy if

    $\pi_A > 1 - \left[ \frac{P}{2P - 1} \right]^2$,

which holds if $P > 1/3$.
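The efficiency condition can be explored with a small sketch. The functions below implement the estimator and the inequality as stated above; their names and the numerical inputs are hypothetical, and P = 1/2 is excluded because Warner's estimator is undefined there.

```python
def mangat_estimate(alpha_hat, n, P):
    """Mangat (1994) estimate of pi_A and the estimate of its variance.
    alpha_hat: observed proportion of 'yes' answers."""
    pi_hat = (alpha_hat - 1.0 + P) / P
    var_hat = alpha_hat * (1.0 - alpha_hat) / ((n - 1) * P ** 2)
    return pi_hat, var_hat

def mangat_beats_warner(pi_A, P):
    """True when pi_A > 1 - [P / (2P - 1)]^2 (assumes P != 1/2)."""
    return pi_A > 1.0 - (P / (2.0 * P - 1.0)) ** 2

# Hypothetical case: pi_A = 0.25 and P = 0.7 give alpha = 0.475.
m_pi, m_var = mangat_estimate(alpha_hat=0.475, n=401, P=0.7)
print(m_pi, m_var, mangat_beats_warner(0.25, 0.7), mangat_beats_warner(0.1, 0.3))
```

For P above 1/3 the right-hand side of the inequality is non-positive, so the condition is met for every $\pi_A > 0$; for smaller P it can fail, as the last call shows.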

12.3.5 The Unrelated Question Model 

If both /( and A, involve sensitive issues Horvitz, Shah, Simmons (1967) and 
Bernard G. Greenberg et al. (1969) have suggested a modification of Warner’s 
KR model by asking the respondent an unrelated question referred to a 
non-sensitive, non-stigmatized and innocuous attribute X along with the 
question on stigmatized character/!. Such a technique enhances the confidence 
ot the respondent in the anonymity. 

A sample of n persons is drawn from the population using SRSWR 
design. Each respondent is provided with a randomized response device 
consisting of two statements (a) I belong to group A (b) I belong to group X 
I he respondent selects randomly a statement from (a) and (b) with probabilities 
and (1 - P) respectively by the random device unobserved by the interviewer 

and reports‘yes’or‘no’. 

Assuming that the respondents reports complete truth, the proportion of 
yes’ answersp,/unbiasedly estimates the probability of‘yes’ answerrc* M = Pn A 
+ -P) n x . Since n x is supposed to be known, the estimate ofn A is given by 
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^ = {El _^^ provided P * 0. 

A P 


n A is unbiased and its variance is given by 


V&a) = 


K (1-71* ) 
nP 2 


The value of P should be chosen close to 1 as far as possible without threatening 
the degree of cooperation by the respondent. 

When $\pi_X$ is not known, draw two independent SRSWRs of sizes $n_1$ and $n_2$ from the population. The two samples are associated with separate probabilities $P_1$ and $P_2$ of relating the response to the sensitive attribute $A$. The two sample proportions of ‘yes’ answers, $p^*_{1u}$ and $p^*_{2u}$, unbiasedly estimate

$$\pi^*_{1u} = P_1\pi_A + (1 - P_1)\pi_X \quad \text{and} \quad \pi^*_{2u} = P_2\pi_A + (1 - P_2)\pi_X$$

respectively.

Thus, an unbiased estimator of $\pi_A$ is given by

$$\hat{\pi}_A = \frac{(1 - P_2)\,p^*_{1u} - (1 - P_1)\,p^*_{2u}}{P_1 - P_2}, \quad \text{provided } P_1 \neq P_2,$$

with variance

$$V(\hat{\pi}_A) = \frac{1}{(P_1 - P_2)^2}\left[\frac{(1 - P_2)^2\,\pi^*_{1u}(1 - \pi^*_{1u})}{n_1} + \frac{(1 - P_1)^2\,\pi^*_{2u}(1 - \pi^*_{2u})}{n_2}\right].$$

An unbiased estimate of $V(\hat{\pi}_A)$ is

$$\hat{V}(\hat{\pi}_A) = \frac{(1 - P_2)^2\,p^*_{1u}(1 - p^*_{1u})}{(P_1 - P_2)^2 (n_1 - 1)} + \frac{(1 - P_1)^2\,p^*_{2u}(1 - p^*_{2u})}{(P_1 - P_2)^2 (n_2 - 1)}.$$
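As a rough illustration of the two-sample unrelated question estimator and its variance estimate, the following Python sketch may help. The function name and the numbers used are hypothetical; the 'yes' counts were constructed so that the sample proportions equal their expectations under $\pi_A = 0.2$ and $\pi_X = 0.6$.

```python
def two_sample_unrelated_question(yes1, n1, P1, yes2, n2, P2):
    """Two-sample unrelated question model with pi_X unknown (illustrative sketch)."""
    if P1 == P2:
        raise ValueError("P1 and P2 must differ")
    p1 = yes1 / n1                    # proportion of 'yes' answers in sample 1
    p2 = yes2 / n2                    # proportion of 'yes' answers in sample 2
    d = P1 - P2
    pi_a_hat = ((1 - P2) * p1 - (1 - P1) * p2) / d
    # Unbiased variance estimate with divisors (n1 - 1) and (n2 - 1)
    var_hat = ((1 - P2) ** 2 * p1 * (1 - p1) / (n1 - 1)
               + (1 - P1) ** 2 * p2 * (1 - p2) / (n2 - 1)) / d ** 2
    return pi_a_hat, var_hat


# Hypothetical data: P1 = 0.8, P2 = 0.2, 28 and 52 'yes' answers out of 100 each
pi_a_hat, var_hat = two_sample_unrelated_question(28, 100, 0.8, 52, 100, 0.2)
print(round(pi_a_hat, 4), round(var_hat, 6))
```

The factor $1/(P_1 - P_2)^2$ makes the estimate unstable when the two selection probabilities are close to each other.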


To achieve high efficiency it is desired to choose $P_1 \neq P_2$ but close to 1/2 so as to induce a sense of protected privacy in a respondent, thereby enhancing the prospects of trustworthy co-operation. $V(\hat{\pi}_A)$ involves five parameters $n_1$, $n_2$, $P_1$, $P_2$ and $\pi_X$ which should be suitably chosen. Moor (1971) has suggested a stepwise optimization procedure by first optimizing $V(\hat{\pi}_A)$ with respect to $n_1$ and $n_2$ and then with respect to $P_1$, $P_2$ and $\pi_X$. As the discussions on this are quite involved the reader may refer to Moor (1971) for details.

Dowling and Shachtman (1975) have shown that the variance of the maximum likelihood estimator in each of the one-sample and two-sample unrelated question randomized response models is less than that of Warner’s estimator uniformly in $\pi_A$ and $\pi_X$, provided $P$ in the single sample case and Max $(P_1, P_2)$ in the two sample case is greater than one-third (approximately).

Bourke (1974) has extended the usual RR technique to multi-proportions RR using the unrelated question technique by successfully blending together the unrelated questions with the sensitive characteristics.
12.3.6 An Alternative Unrelated Question Model


Folsom et al. (1973) have developed a new randomized response design which uses the second sample more efficiently. Let $X_1$ and $X_2$ be two non-sensitive alternate questions. The technique is to ask directly a second unrelated question ($X_2$) in the first sample after using a randomizing device with questions $A$ and $X_1$. Similarly, members of the second sample are required to respond to a randomizing device containing questions $A$ and $X_2$ besides their direct responses to question $X_1$. In both the samples the sensitive question $A$ is selected with probability $P$. Let $\pi^*_i$ denote the probability of a ‘yes’ response to the question selected by the randomizing device in sample $i$ and $\pi_{id}$ the probability of a ‘yes’ response to the direct question in sample $i$ ($i = 1, 2$).

Thus, we have the probabilities for two samples written as

Sample 1: $\pi^*_1 = P\pi_A + (1 - P)\pi_{X_1}$ and $\pi_{1d} = \pi_{X_2}$,

Sample 2: $\pi^*_2 = P\pi_A + (1 - P)\pi_{X_2}$ and $\pi_{2d} = \pi_{X_1}$,

where $\pi_A$ = the proportion of the population with the sensitive attribute $A$ and $\pi_{X_i}$ = the proportion of the population with non-sensitive attribute $X_i$ ($i = 1, 2$).

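Two unbiased estimates of $\pi_A$ can be obtained from the observed frequencies $p^*_1$ and $p^*_2$ of ‘yes’ responses to the randomizing device in the two samples, together with the observed frequencies $p_{1d}$ and $p_{2d}$ of ‘yes’ responses to the direct questions, as

$$\hat{\pi}_A(1) = \frac{p^*_1 - (1 - P)\,p_{2d}}{P} \quad \text{and} \quad \hat{\pi}_A(2) = \frac{p^*_2 - (1 - P)\,p_{1d}}{P}.$$

An optimum estimator of $\pi_A$ is obtained by forming a weighted estimator

$$\hat{\pi}_{AW} = W\hat{\pi}_A(1) + (1 - W)\hat{\pi}_A(2),$$

where $W$ is chosen so as to minimize $V(\hat{\pi}_{AW})$.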

For more details as to Folsom et al.’s weighted estimator and its efficiency 
comparison with Moor’s optimized version of the standard two sample single 
unrelated question model and with the one sample single unrelated question 
model, the interested readers may refer to Folsom et al. (1973). 

12.3.7 Randomized Response Technique for a 
Quantitative Characteristic 


Let y be a sensitive characteristic, assumed to be quantitative in nature. For example, a sensitive question may be regarding personal income of respondents, bank balances etc. Greenberg et al. (1971) and Erikson (1973) have suggested models for estimating the mean of a sensitive quantitative characteristic using the extension of the unrelated question model used for binary data.
Suppose the sensitive response $Y$ has the probability function $f(y)$ while the non-sensitive response $X$ has a specified probability function $g(x)$. Assume that the respondent answers the sensitive question with probability $P$ and the non-sensitive question with probability $(1 - P)$. Thus, the distribution of the observed response $z$ is given by

$$h(z) = P f(y) + (1 - P)\, g(x),$$

which is a mixture of two distributions $f(y)$ and $g(x)$. The mean and variance of $z$ are

$$\mu_z = P\mu_y + (1 - P)\mu_x$$

and

$$\sigma_z^2 = \sigma_x^2 + P(\sigma_y^2 - \sigma_x^2) + P(1 - P)(\mu_y - \mu_x)^2.$$

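Therefore, the estimate of the mean $\bar{Y}$ is

$$\hat{\mu}_y = \frac{\hat{\mu}_z - (1 - P)\mu_x}{P},$$

where $\hat{\mu}_z$ is calculated as the sample mean of responses, denoted by $\bar{z}$. Now,

$$V(\hat{\mu}_y) = \frac{\sigma_z^2}{nP^2}.$$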

The Additive Model 

The respondent is asked to sum his sensitive attribute $y$ and a random value $x$ from a known distribution. The observed response is therefore $z = y + x$. The mean and variance of $z$ are

$$\mu_z = \mu_y + \mu_x \quad \text{and} \quad \sigma_z^2 = \sigma_y^2 + \sigma_x^2,$$

if $y$ and $x$ are independent. Therefore, $\mu_y = \mu_z - \mu_x$, where $\mu_z$ is estimated by the simple random sample mean response $\bar{z}$, and

$$V(\hat{\mu}_y) = \frac{\sigma_y^2 + \sigma_x^2}{n}.$$

The Multiplicative Model 

The respondent is asked to multiply his $Y$-value with a random value $x$ from a known distribution. Thus, the observed response $z$ is expressed as $z = yx$. If $y$ and $x$ are independent,

$$E(z) = E(y)\,E(x), \quad \text{that is,} \quad \mu_z = \mu_y\mu_x.$$

Further,

$$\sigma_z^2 = \mu_x^2\sigma_y^2 + \mu_y^2\sigma_x^2 + \sigma_y^2\sigma_x^2.$$

The estimator of $\mu_y$ is given by

$$\hat{\mu}_y = \frac{\bar{z}}{\mu_x}.$$
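The three point estimators for a quantitative sensitive characteristic (unrelated question, additive and multiplicative models) can be sketched as follows. The function names and the responses are hypothetical; each function simply inverts the relation between $\mu_z$ and $\mu_y$ stated for the corresponding model, with $\mu_z$ replaced by the sample mean of the responses.

```python
from statistics import mean


def mean_unrelated(z, P, mu_x):
    """Unrelated question model: mu_z = P*mu_y + (1 - P)*mu_x."""
    return (mean(z) - (1 - P) * mu_x) / P


def mean_additive(z, mu_x):
    """Additive model: z = y + x, so mu_y = mu_z - mu_x."""
    return mean(z) - mu_x


def mean_multiplicative(z, mu_x):
    """Multiplicative model: z = y*x, so mu_y = mu_z / mu_x (mu_x must be non-zero)."""
    return mean(z) / mu_x


# Hypothetical randomized responses with sample mean 20
z = [10.0, 20.0, 30.0]
print(mean_unrelated(z, P=0.5, mu_x=10.0))   # (20 - 5) / 0.5
print(mean_additive(z, mu_x=10.0))           # 20 - 10
print(mean_multiplicative(z, mu_x=10.0))     # 20 / 10
```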




Alternatively, we may choose two independent SRSWRs of sizes $n_i$ ($i = 1, 2$) with $n = n_1 + n_2$. The $j$th respondent from the $i$th sample is asked to report his value of $Y$, say $Y_j$, with probability $P_i$ ($i = 1, 2$) and a value, say $X_j$, on a non-sensitive character $X$ with complementary probability $(1 - P_i)$, without the knowledge of the interviewer as to the variable on which he is reporting. Denote the RR value reported as $Z_{ji}$ ($i = 1, 2$). We have

$$Z_{ji} = P_i Y_j + (1 - P_i) X_j.$$

Taking expectation, $\mu_{z_i} = P_i\mu_y + (1 - P_i)\mu_x$, $i = 1, 2$, and if $P_1 \neq P_2$, then

$$\mu_y = \frac{(1 - P_2)\mu_{z_1} - (1 - P_1)\mu_{z_2}}{P_1 - P_2}$$

and is unbiasedly estimated by

$$\hat{\mu}_y = \frac{(1 - P_2)\bar{Z}_1 - (1 - P_1)\bar{Z}_2}{P_1 - P_2}.$$

Denoting

$$s_{z_i}^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i}(Z_{ji} - \bar{Z}_i)^2,$$

an unbiased estimator of $V(\hat{\mu}_y)$ is given by

$$\hat{V}(\hat{\mu}_y) = \frac{1}{(P_1 - P_2)^2}\left[\frac{(1 - P_2)^2 s_{z_1}^2}{n_1} + \frac{(1 - P_1)^2 s_{z_2}^2}{n_2}\right].$$
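A minimal Python sketch of the two-sample estimator of $\mu_y$ and its estimated variance is given below. The function name and the tiny data sets are hypothetical and serve only to show the arithmetic; `statistics.variance` uses the divisor $(n_i - 1)$, matching $s_{z_i}^2$.

```python
from statistics import mean, variance


def two_sample_mean_estimate(z1, P1, z2, P2):
    """Two-sample RR estimate of mu_y with its variance estimate (illustrative sketch)."""
    if P1 == P2:
        raise ValueError("P1 and P2 must differ")
    d = P1 - P2
    mu_hat = ((1 - P2) * mean(z1) - (1 - P1) * mean(z2)) / d
    var_hat = ((1 - P2) ** 2 * variance(z1) / len(z1)
               + (1 - P1) ** 2 * variance(z2) / len(z2)) / d ** 2
    return mu_hat, var_hat


# Hypothetical responses from the two samples
mu_hat, var_hat = two_sample_mean_estimate([1.0, 2.0, 3.0], 0.75, [2.0, 4.0, 6.0], 0.25)
print(mu_hat, round(var_hat, 4))
```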


A general linear design unbiased estimator of population total $Y$ may be written in the form

$$t(s, Y) = b_{s0} + \sum_{i \in s} b_{si} Y_i,$$

where the coefficients satisfy

$$\sum_{s \ni i} b_{si}\, p(s) = 1, \; i = 1, \ldots, N; \qquad \sum_{s} b_{s0}\, p(s) = 0.$$

If $y$ is a sensitive character, the randomized response technique aims at estimating the true value $Y_i$ through the randomized response variable $Z_i$ attached to the unit $i$.
Choose a large number of real variables $X_j$ ($j = 1, \ldots, M$), not necessarily all of them distinct. Suppose a sample $s$ is chosen with selection probability $p(s)$. The sampled person $i$ is asked to report the value $Z_i$ which is equal to $Y_i$ with pre-assigned probability $c$ and equals $X_j$ with probability $q_j$ ($j = 1, \ldots, M$) such that

$$c + \sum_{j=1}^{M} q_j = 1.$$

Then,

$$E_R(Z_i) = cY_i + \sum_{j=1}^{M} q_j X_j = cY_i + (1 - c)\mu_x,$$

where $E_R$ is over RR and

$$\mu_x = \frac{\sum_j q_j X_j}{\sum_j q_j}.$$

Define

$$R_i = \frac{Z_i - (1 - c)\mu_x}{c}.$$

Then $E_R(R_i) = Y_i$ and

$$V_R(R_i) = \frac{c(1 - c)Y_i^2 - 2c(1 - c)\mu_x Y_i + \sum_j q_j X_j^2 - (1 - c)^2\mu_x^2}{c^2}.$$


The estimator $t^*(s, R) = b_{s0} + \sum_{i \in s} b_{si} R_i$ is an unbiased estimator of the population total $Y$. This follows from the fact that

$$E[t^*(s, R)] = E_p E_R[t^*(s, R)] = E_p[t(s, Y)] = Y$$

and

$$V[t^*(s, R)] = E_p V_R[t^*(s, R)] + V_p E_R[t^*(s, R)]$$

$$= \sum_s p(s) \sum_{i \in s} b_{si}^2 V_R(R_i) + V_p[t(s, Y)]$$

$$= V_p[t(s, Y)] + \frac{1}{c^2}\sum_s p(s)\left[\sum_{i \in s} b_{si}^2 V_R(Z_i)\right].$$

Warner (1971) has established a general linear randomized response model, with estimates and variances obtained through analogy with usual linear regression models.

Bellhouse (1980) has derived certain extensions for the situations where
the sensitive characteristic under randomization of response is subject to
reporting or recording errors. Adhikary, Chaudhuri and Vijayan (1984) have 
derived optimum strategies in a more general set up involving randomized 
response techniques. For more details reference may be made to Chaudhuri and 
Mukherjee (1987). 


NUMERICAL EXAMPLES


Example 12.1: A survey for studying consumer preferences for certain items of daily consumption among city dwellers was conducted by mail with an expected response rate of 60%. If the precision required is equivalent to that given by a simple random sample of size 500 without non-response, determine how many questionnaires should be sent out by mail and what percentage of non-respondents should be interviewed.

The unit costs in rupees are given below:

$C_0$ = cost of making first attempt = Rs. 5/-,

$C_1$ = cost of processing data obtained from the first attempt = Rs. 10/-,

$C_2$ = cost of collecting and processing data obtained from interviewing in the second attempt = Rs. 30/-.

Solution

$W_1 = 0.6$, $W_2 = 0.4$.

Assuming the variances $S^2$ and $S_2^2$ to be equal and $N$ to be large,

$$k = \sqrt{\frac{C_2(1 - W_2)}{C_0 + C_1 W_1}} = \sqrt{\frac{30(1 - 0.4)}{5 + 10(0.6)}} = \sqrt{\frac{18}{11}} = 1.2792,$$

$$n_{opt} = \frac{S^2\,[1 + (k - 1)W_2]}{V}, \quad \text{where } V = \frac{S^2}{500},$$

$$n_{opt} = 500\,[1 + 0.2792\,(0.4)] = 555.84 \approx 556.$$
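The arithmetic of Example 12.1 can be reproduced with a short Python sketch. The function name is hypothetical, and the formulas assume, as in the solution, equal variances $S^2 = S_2^2$ and large $N$. The last quantity, $1/k$, is the sub-sampling fraction of non-respondents under the Hansen-Hurwitz rule of interviewing $n_2/k$ of the $n_2$ non-respondents.

```python
from math import ceil, sqrt


def hansen_hurwitz_plan(n0, W2, c0, c1, c2):
    """Optimum k and initial sample size, assuming S^2 = S_2^2 and large N.

    n0 is the size of the simple random sample (without non-response)
    whose precision is to be matched.
    """
    W1 = 1 - W2
    k = sqrt(c2 * (1 - W2) / (c0 + c1 * W1))
    n_opt = n0 * (1 + (k - 1) * W2)
    return k, n_opt


k, n_opt = hansen_hurwitz_plan(n0=500, W2=0.4, c0=5, c1=10, c2=30)
print(round(k, 4))            # value of k
print(ceil(n_opt))            # questionnaires to be mailed
print(round(100 / k, 1))      # percentage of non-respondents to interview
```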

Example 12.2: Suppose among 1500 income tax payers we are interested in estimating the proportion of individuals resorting to wrong declaration of income. As the enquiry is on a sensitive issue, attempts to carry out a direct response survey would naturally fail and we may adopt Warner’s Randomized Response model as follows:

Take a collection of 4 green balls and 6 red balls and put them in a bowl. The respondent will randomly pick a ball from the bowl and answer the following question: ‘Are you a tax offender?’, if the ball is red, and ‘Are you an honest tax payer?’, if the ball is green. The interviewer is finally supplied with the number of ‘yes’ answers without the knowledge as to which question is being answered by the respondent.

Suppose in an SRSWR of 75 respondents, 35 provided ‘yes’ answers and 40 ‘no’ answers.

Estimate the proportion of tax offenders in the population along with its standard error.

Solution: We have $P = 0.6$ and $p^* = 35/75 = 0.4667$.

The proportion of tax offenders is estimated by

$$\hat{\pi}_A = \frac{p^* - (1 - P)}{2P - 1} = \frac{0.4667 - 0.4}{2(0.6) - 1} = 0.3335.$$

$$\text{Est. } V(\hat{\pi}_A) = \frac{p^*(1 - p^*)}{(n - 1)(2P - 1)^2} = \frac{0.4667\,(1 - 0.4667)}{(75 - 1)(0.2)^2} = 0.0841.$$

Est. S.E. $(\hat{\pi}_A) = 0.29$.
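The computation in Example 12.2 can be verified with the following Python sketch (the function name is hypothetical). Working with the unrounded proportion 35/75 gives an estimate of 0.3333 rather than 0.3335; the small difference comes from rounding $p^*$ to 0.4667 in the hand calculation.

```python
from math import sqrt


def warner_estimate(yes_count, n, P):
    """Warner's RR estimate of pi_A with estimated variance and standard error."""
    if P == 0.5:
        raise ValueError("P must differ from 1/2")
    p_star = yes_count / n
    pi_a_hat = (p_star - (1 - P)) / (2 * P - 1)
    var_hat = p_star * (1 - p_star) / ((n - 1) * (2 * P - 1) ** 2)
    return pi_a_hat, var_hat, sqrt(var_hat)


pi_a_hat, var_hat, se_hat = warner_estimate(35, 75, 0.6)
print(round(pi_a_hat, 4), round(var_hat, 4), round(se_hat, 2))
```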


EXERCISES 

12.1 An initial simple random sample of size $n$ is selected out of $N$ units to observe character $y$. Suppose $n_1$ out of $n$ respond at the first attempt and $n_2$ do not respond. Select a predetermined sample of size

$$n_2' = \frac{n_2^2}{k'n + n_2}, \quad \text{where } k' > 0,$$

from the $n_2$ non-respondents and completely enumerate it. Let $\bar{y}_1$ and $\bar{y}_2'$ be sample means based on sample $n_1$ and sub-sample $n_2'$ respectively.

Show that $\bar{y} = (n_1\bar{y}_1 + n_2\bar{y}_2')/n$ is an unbiased estimate of $\bar{Y}$, with variance given by

$$V(\bar{y}) = \frac{(N - n)}{Nn}S^2 + \frac{k'S_2^2}{n},$$

where $S^2$ is the overall variance of the $y$’s and $S_2^2$ is the variance corresponding to the $N_2$ individuals belonging to the non-respondent group in the population (Srinath, 1971). How does it compare with Hansen-Hurwitz’s estimate with $n_2' = n_2/k'$?

12.2 Suppose an initial random sample of size $n$ is selected from the population. The first mail attempt leads to, say, $n_1$ responses and $n_{12}$ non-responses. A sub-sample of size $n'_{12} = n_{12}/k_1$ is taken at random from the $n_{12}$ non-respondents. A second mail attempt is made giving $n_2$ responses and $n_{22}$ non-respondents. Again from the group of non-respondents a sub-sample of size $n'_{22} = n_{22}/k_2$ is taken and a third attempt is made and so on. The $(L + 1)$th attempt is made on a sub-sample of size $n'_{L2} = n_{L2}/k_L$ taken from $n_{L2}$ non-respondents. It is assumed that there is no non-response to the $(L + 1)$th attempt. Show that an unbiased estimator of $\bar{Y}$ is given by

$$\bar{y} = \left(\frac{1}{n}\right)[\,y_1 + k_1 y_2 + k_1 k_2 y_3 + \ldots + k_1 k_2 \ldots k_L\, y_{L+1}\,],$$

where $y_1, \ldots, y_{L+1}$ are the totals of the characteristic based on $n_1, n_2, \ldots, n_{L+1}$ respondents respectively, and $k_1, k_2, \ldots, k_L$ are constants greater than 1, fixed in advance. Derive the variance of $\bar{y}$.

12.3 Estimate the population mean in a two stage sampling set up when non-respondents occur at each stage.

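12.4 Let $x_i$ ($i = 1, \ldots, N$) denote the true value and $y_i$ ($i = 1, \ldots, N$) denote the reported value for the $i$th unit. Assume that $y_i$ is a random variable with $\mu_i$ as the expected value. In SRSWOR of size $n$, let the sample mean $\bar{y}_n$ be an estimate of $\bar{X}_N$. Show that $\bar{y}_n$ is biased and its variance is given by

$$\frac{1}{n}\left[\bar{\sigma}^2 + \left(1 - \frac{n}{N}\right)S_\mu^2\right],$$

where

$$\bar{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N}\sigma_i^2, \quad \sigma_i^2 \text{ being the variance of } y_i,$$

and

$$S_\mu^2 = \frac{1}{N - 1}\sum_{i=1}^{N}(\mu_i - \bar{\mu})^2, \quad \bar{\mu} = \frac{1}{N}\sum_{i=1}^{N}\mu_i.$$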

12.5 The population proportion $P$ belonging to class A is estimated by drawing a sample of $n$ units randomly from a population of $N$ units. Some units were misclassified due to response errors. Assuming that response errors and units are uncorrelated, show that the variance of the sample proportion $p$ is given by

$$V(p) = \frac{PQ}{n}\cdot\frac{N - n}{N - 1} + \frac{n - 1}{N - 1}\cdot\frac{1}{nN}\sum_{i=1}^{N}P_i Q_i,$$

where $P_i$ is the probability that the unit $U_i$ is classified as belonging to class A, given $\sum P_i = NP$, $Q_i = 1 - P_i$ and $Q = 1 - P$.

12.6 Suppose unknown to the interviewer the respondent reports the actual value only with probability $C$ ($0 < C < 1$) or alternatively would report a random value from a given set of numbers $X_1, \ldots, X_M$ that would cover the whole range of possible values for the characteristic of the survey. Denote $Y_i$ for the characteristic value of the $i$th unit in the population and $Z_i$ as the response,

$$Z_i = \begin{cases} Y_i & \text{with probability } C \\ X_j & \text{with probability } (1 - C)/M, \; j = 1, \ldots, M, \end{cases}$$

while $\mathbf{Y}$ stands for the vector of population values and $\mathbf{Z}$ for the vector of responses. Conceptually the randomization can be done for every unit in the population beforehand. Let $Y = \sum Y_i$.

If the sample respondents report true values, denote $e(s, \mathbf{Y})$ as an estimator which depends on $\mathbf{Y}$ only through units in the sample $s$.

Further, denote $t(s, \mathbf{Z})$ as an estimator based on randomized responses,

$$t(s, \mathbf{Z}) = \frac{1}{C}\left[e(s, \mathbf{Z}) - \frac{1 - C}{M}\sum_{j=1}^{M}X_j\sum_{i \in s}\frac{1}{\pi_i}\right].$$

Prove that

i) the derived estimator $t(s, \mathbf{Z})$ is an unbiased estimator of $Y$, when responses are randomized.

ii) any unbiased estimator $a(s, \mathbf{Z})$ must necessarily be of the form $t(s, \mathbf{Z})$.

iii) $a(s, \mathbf{Z})$ is not uniformly minimum variance.

iv) derive the optimum properties of the derived estimator $t^*(s, \mathbf{Z})$ in the Horvitz-Thompson set up, defined by

$$t^*(s, \mathbf{Z}) = \sum_{i \in s}\frac{Z_i^*}{\pi_i},$$

where

$$Z_i^* = \frac{1}{C}\left[Z_i - \frac{1 - C}{M}\sum_{j=1}^{M}X_j\right]$$

(Adhikary, Chaudhuri and Vijayan, 1984).

12.7 From a finite population consisting of $N$ units, a SRSWOR of $n$ units is drawn. Let $r$ ($r = 0, 1, 2, \ldots, n - 2$) denote the number of units on which information could not be available due to random non-response. Derive the biases and mean square errors of the following estimators of the finite population variance $S_y^2$, when information on $x$ is available for all the $n$ units and $a$ is a suitably chosen constant.

t\ = (V 2 /*/ 2 ) & 2 


*2 (,(sy 2 /s x 2 ) S x 2 + a —— l 

n O 5 


V 2 and i/J are based on units for which information are available 